Let $D$ denote an infinite alphabet, that is, a set that consists of infinitely many symbols. A word
$w = a_0 b_0 a_1 b_1 \cdots a_n b_n$ of even length over $D$ can be viewed as a directed graph $G_w$ whose vertices
are the symbols that appear in $w$, and the edges are $(a_0, b_0), (a_1, b_1), \ldots, (a_n, b_n)$. For a positive
integer $m$, define a language $\mathcal{R}_m$ such that a word $w = a_0 b_0 \cdots a_n b_n \in \mathcal{R}_m$ if and only if there is
a path in the graph $G_w$ of length $\le m$ from the vertex $a_0$ to the vertex $b_n$.

We establish the following hierarchy theorem for pebble automata over infinite alphabets. For
every positive integer $k$, (i) there exists a $k$-pebble automaton that accepts the language $\mathcal{R}_{2^k-1}$;
(ii) there is no $k$-pebble automaton that accepts the language $\mathcal{R}_{2^{k+1}-2}$. Based on this result, we
establish a number of previously unknown relations among some classes of languages over infinite
alphabets.

Categories and Subject Descriptors: F.1.1 [Models of Computation]: Pebble automata; F.4.1
[Mathematical Logic]: Computational logic 

General Terms: Languages 

Additional Key Words and Phrases: Pebble automata, Graph reachability, Infinite alphabets 



1. INTRODUCTION 

Logic and automata for words over finite alphabets are relatively well understood,
and there has recently been broad research activity on logic and automata for words and
trees over infinite alphabets. The study of infinite alphabets is motivated partly by
the need for formal verification and synthesis of infinite-state systems and partly
by the search for automated reasoning techniques for XML. There has been
significant progress in this area, see [Björklund and Schwentick 2007; Bojanczyk
et al. 2011a; Demri and Lazic 2009; Kaminski and Francez 1994; Neven et al.
2004; Segoufin 2006], and this paper aims to contribute to the progress.

Roughly speaking, there are two approaches to studying languages over infinite 
alphabets: logic and automata. Below is a brief summary on both approaches. For 
a more comprehensive survey, we refer the reader to [Segoufin 2006]. The study 
of languages over infinite alphabets starts with the introduction of finite-memory 
automata (FMA) in [Kaminski and Francez 1994], also known as register automata
(RA), that is, automata with a finite number of registers. From here on, we write
RA$_n$ to denote RA with $n$ registers.

The study of RA was continued and extended in [Neven et al. 2004], in which 
pebble automata (PA) were also introduced. Each of these models has its own
advantages and disadvantages. Languages accepted by RA are closed under standard
language operations: intersection, union, concatenation, and Kleene star. In
addition, from the computational point of view, RA are a much easier model to handle.
Their emptiness problem is decidable, whereas the same problem for PA is not. 
However, the PA languages possess a very nice logical property: closure under all 
boolean operations. 

Recently, a more general model of RA was introduced in [Bojanczyk et al.
2011b] that builds on the idea of nominal sets. In this model the structure for the
symbols is richer. In addition to the equality test, it allows for total order and partial
order tests among the symbols.

In [Bouyer 2002] data words are introduced, which are an extension of words
over infinite alphabets. Data words are words in which each position carries both
a label from a finite alphabet and a data value from an infinite alphabet. The
paper [Bojanczyk et al. 2011a] studies the logic for data words and introduces
the so-called data automata. It was shown that data automata define the logic
$\exists\mathrm{MSO}^2(\sim, <, +1)$, the fragment of existential monadic second-order logic in which
the first-order part is restricted to two variables only, with the signatures: the
data equality $\sim$, the order $<$ and the successor $+1$. An important feature of data
automata is that their emptiness problem is decidable, even for infinite words,
but is at least as hard as reachability for Petri nets. It was also shown that the
satisfiability problem for the three-variable first-order logic is undecidable.

Another logical approach is via the so-called linear temporal logic with freeze
quantifier, introduced in [Demri et al. 2005] and later also studied in [Demri and
Lazic 2009]. Intuitively, these are LTL formulas equipped with a finite number of
registers to store the data values. We denote by $\mathrm{LTL}^{\downarrow}_n[X, U]$ the LTL with freeze
quantifier, where $n$ denotes the number of registers and the only temporal operators
allowed are the neXt operator X and the Until operator U. It was shown that
alternating RA$_n$ accept all $\mathrm{LTL}^{\downarrow}_n[X, U]$ languages and the emptiness problem for
alternating RA$_1$ is decidable. However, the complexity is non-primitive recursive.
Hence, the satisfiability problem for $\mathrm{LTL}^{\downarrow}_1[X, U]$ is decidable as well. Adding one
more register or past time operators, such as $X^{-1}$ or $U^{-1}$, to $\mathrm{LTL}^{\downarrow}_1[X, U]$ makes the
satisfiability problem undecidable. In [Lazic 2011] a weaker version of alternating
RA$_1$, called safety alternating RA$_1$, is considered, and the emptiness problem is
shown to be EXPSPACE-complete.

In this paper we continue the study of pebble automata (PA) for strings over
infinite alphabets introduced in [Neven et al. 2004]. The original PA for strings
over finite alphabets were first introduced and studied in [Globerman and Harel 1996].
Essentially, PA are finite state automata equipped with a finite number of pebbles.
The pebbles are placed on or lifted from the input word in the stack discipline
(first in, last out) and are intended to mark the positions in the input word. One
pebble can only mark one position and the most recently placed pebble serves as the
head of the automaton. The automaton moves from one state to another depending
on the equality tests among data values in the positions currently marked by the
pebbles, as well as the equality tests among the positions of the pebbles.

As mentioned earlier, PA languages possess a very nice logical property: closure 
under all boolean operations. Another desirable property of PA languages is, as 
shown in [Neven et al. 2004], that nondeterminism and two-way-ness do not increase 
the expressive power of PA [Neven et al. 2004, Theorem 4.6]. Moreover, the class 
of PA languages lies strictly in between FO(~, <, +1) and MSO(~, <, +1) [Neven 
et al. 2004, Theorems 4.1 and 4.2]. 

Moreover, looking at the stack discipline imposed on the placement of the pebbles, 
one can rightly view PA as a natural extension of FO(~, <, +1). To simulate a first- 
order sentence of quantifier rank k, a pebble automaton with k pebbles suffices: one 
pebble for each quantifier depth. (See Proposition 2.5.) 

In this paper we study PA as a model of computation for the directed graph
reachability problem. To this end, we view a word of even length $w = a_0 b_0 a_1 b_1 \cdots a_n b_n$
over an infinite alphabet as a directed graph $G_w = (V_w, E_w)$ with the symbols that
appear in $a_0 b_0 a_1 b_1 \cdots a_n b_n$ as the vertices in $V_w$ and $(a_0, b_0), \ldots, (a_n, b_n)$ as the
edges in $E_w$. We say that the word $w$ induces the graph $G_w$.

We prove that for any positive integer $k$, $k$ pebbles are sufficient for recognizing
the existence of a path of length at most $2^k - 1$ from the vertex $a_0$ to the vertex $b_n$, but
are not sufficient for recognizing the existence of a path of length at most $2^{k+1} - 2$ from the
vertex $a_0$ to the vertex $b_n$. Based on this result, we establish the following
relations among the classes of languages over infinite alphabets which were previously
unknown.

(1) A strict hierarchy of the PA languages based on the number of pebbles. 

(2) The separation of monadic second order logic from the PA languages. 

(3) The separation of one-way deterministic RA languages from PA languages. 

Some of these results settle questions left open in [Neven et al. 2004; Segoufin 2006]. 

Although, in general, the emptiness problem for PA is undecidable, we believe 
that our study may contribute to the technical aspect of reasoning on classes of 
languages with decidable properties. For example, in Section 4 a similar technique
is used to obtain a separation result for $\mathrm{LTL}^{\downarrow}_1[X, U]$ languages, a class of languages
with decidable satisfiability problem.

Related work. A weaker version of PA, called top-view weak PA was introduced 
and studied in [Tan 2010], where it was also shown that the emptiness problem is 
decidable. The results in this paper are not implied by that paper, as here the
main concern is separation results. In fact, some of the separation results here also 
hold for the model in [Tan 2010]. 

There is also an analogy between our result and the classical first-order quantifier
lower bounds for directed graph $(s,t)$-reachability, which state the following: There
is a first-order sentence of quantifier rank $k$ to express the existence of a path of
length $\le m$ from the source node $s$ to the target node $t$ if and only if $m \le 2^k$. See,
for example, [Turán 1984].

As far as we can see, our result is actually a tighter version of the classical result
for first-order logic. It is tighter because PA is shown to be stronger than first-order
logic (Proposition 2.5). In particular, pebble automata do have states, and thus enjoy
the usual benefits associated with automata, like counting the number of edges, or
the number of neighbours up to $\le m$, $\ge m$, or mod $m$, for an arbitrary but fixed
positive integer $m$, without increasing the number of pebbles.

Other related results are those established in [Ajtai and Fagin 1990; Fagin et al.
1995; Schwentick 1996]. To the best of our knowledge, those results have no
connection with the result in this paper. In [Ajtai and Fagin 1990] it is established that
$(s,t)$-reachability in directed graphs is not in monadic NP*, while in [Fagin et al.
1995; Schwentick 1996] it is established that undirected graph connectivity is not in
monadic NP. However, no lower bound on first-order quantifier rank is established.

Organization. This paper is organized as follows. In Section 2 we review the
monadic second-order logic MSO(~, <, +1) and pebble automata (PA) for words
over infinite alphabets. Section 3 is the core of the paper in which we present our
main results. In Section 4 we discuss how to adjust our results and proofs presented
in Section 3 to a weaker version of PA, called weak PA, whose relation to the logic
$\mathrm{LTL}^{\downarrow}_1[X, U]$ is presented in Section 5.

2. MODELS OF COMPUTATIONS 

In Section 2.1 we recall the definition of alternating pebble automata from [Neven 
et al. 2004], and in Section 2.2 a logic for languages over infinite alphabets. 

We shall use the following notation: $D$ is a fixed infinite alphabet not containing
the left-end marker $\triangleleft$ or the right-end marker $\triangleright$. The input word to an automaton
is of the form $\triangleleft w \triangleright$, where $w \in D^*$. Symbols of $D$ are denoted by lower case
letters $a, b, c$, etc., possibly indexed, and words over $D$ by lower case letters $u, v, w$,
etc., possibly indexed.

2.1 Pebble automata 

Definition 2.1. (See [Neven et al. 2004, Definition 2.3]) A two-way alternating
$k$-pebble automaton (in short, $k$-PA) is a system $A = (Q, q_0, F, \mu, U)$ whose
components are defined as follows.

(1) $Q$, $q_0 \in Q$ and $F \subseteq Q$ are a finite set of states, the initial state, and the set of
final states, respectively;

(2) $U \subseteq Q - F$ is the set of universal states; and

(3) $\mu$ is a finite set of transitions of the form $\alpha \to \beta$ such that

- $\alpha$ is of the form $(i, P, V, q)$, where $i \in \{1, \ldots, k\}$, $P, V \subseteq \{i+1, \ldots, k\}$, $q \in Q$,
and

- $\beta$ is of the form $(q, \mathrm{act})$, where $q \in Q$ and

act $\in$ {left, right, stay, place-pebble, lift-pebble}.

The intuitive meaning of $P$ and $V$ in $(i, P, V, q)$ is that $P$ denotes the set of
pebbles that occupy the same position as pebble $i$, while $V$ denotes the set of pebbles
that read the same symbol as pebble $i$. A more precise explanation can be
found below.



* Monadic NP is a complexity theoretic name for existential monadic second order logic.

Given a word $w = a_1 \cdots a_n \in D^*$, a configuration of $A$ on $\triangleleft w \triangleright$ is a triple $\gamma =
[i, q, \theta]$, where $i \in \{1, \ldots, k\}$, $q \in Q$ and $\theta : \{i, i+1, \ldots, k\} \to \{0, 1, \ldots, n, n+1\}$.
The function $\theta$ defines the position of the pebbles and is called the pebble assignment
of $\gamma$. The symbols in the positions $0$ and $n+1$ are $\triangleleft$ and $\triangleright$, respectively. That is,
we count the leftmost position in $w$ as position 1.

The initial configuration of $A$ on $w$ is $\gamma_0 = [k, q_0, \theta_0]$, where $\theta_0(k) = 0$ is the
initial pebble assignment. A configuration $[i, q, \theta]$ with $q \in F$ is called an accepting
configuration. A transition $(i, P, V, p) \to \beta$ applies to a configuration $[j, q, \theta]$, if

(1) $i = j$ and $p = q$,

(2) $P = \{l > i \mid \theta(l) = \theta(i)\}$, and

(3) $V = \{l > i \mid a_{\theta(l)} = a_{\theta(i)}\}$.

We define the transition relation $\vdash_{A,w}$ on $\triangleleft w \triangleright$ as follows: $[i, q, \theta] \vdash_{A,w} [i', q', \theta']$, if
there is a transition $\alpha \to (p, \mathrm{act}) \in \mu$ that applies to $[i, q, \theta]$ such that $q' = p$, for
all $j > i$, $\theta'(j) = \theta(j)$, and

- if act = left, then $i' = i$ and $\theta'(i) = \theta(i) - 1$,

- if act = right, then $i' = i$ and $\theta'(i) = \theta(i) + 1$,

- if act = stay, then $i' = i$ and $\theta'(i) = \theta(i)$,

- if act = lift-pebble, then $i' = i + 1$,

- if act = place-pebble, then $i' = i - 1$, $\theta'(i - 1) = 0$ and $\theta'(i) = \theta(i)$.
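
The case analysis above can be mirrored in a few lines of code. The following Python fragment is only an illustrative sketch under our own modelling assumptions: the function names `guard_sets` and `apply_action` are hypothetical, a pebble assignment is modelled as a dictionary from pebble index to position, the end-markers are represented by the placeholder strings "<" and ">", and no bounds checking is performed (for instance, moving left from position 0 is not rejected).

```python
def guard_sets(i, theta, word):
    """Return (P, V) for head pebble i under pebble assignment theta.

    P: pebbles l > i at the same position as pebble i.
    V: pebbles l > i that read the same symbol as pebble i.
    """
    tape = ["<"] + list(word) + [">"]  # positions 0 .. n+1 (placeholder end-markers)
    P = {l for l in theta if l > i and theta[l] == theta[i]}
    V = {l for l in theta if l > i and tape[theta[l]] == tape[theta[i]]}
    return P, V


def apply_action(i, theta, act):
    """Return (i', theta') after head pebble i performs the action act."""
    new_theta = dict(theta)
    if act == "left":
        new_theta[i] = theta[i] - 1
    elif act == "right":
        new_theta[i] = theta[i] + 1
    elif act == "stay":
        pass
    elif act == "lift-pebble":
        del new_theta[i]          # pebble i leaves the word; pebble i+1 becomes the head
        return i + 1, new_theta
    elif act == "place-pebble":
        new_theta[i - 1] = 0      # the new pebble starts on the left end-marker
        return i - 1, new_theta
    else:
        raise ValueError(f"unknown action: {act}")
    return i, new_theta
```

A transition $(i, P, V, p) \to \beta$ then applies to a configuration exactly when the pair returned by `guard_sets` equals $(P, V)$ and the states match.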

As usual, we denote the reflexive, transitive closure of $\vdash_{A,w}$ by $\vdash^*_{A,w}$. When the
automaton $A$ and the word $w$ are clear from the context, we shall omit the subscripts
$A$ and $w$. For $1 \le i \le k$, an $i$-configuration is a configuration of the form $[i, q, \theta]$,
that is, when the head pebble is pebble $i$.

Remark 2.2. Here we define PA as a model of computation for languages over 
infinite alphabet. Another option is to define PA as a model of computation for 
data words. A data word is a finite sequence over $\Sigma \times D$, where $\Sigma$ is a finite alphabet
of labels. There is only a slight technical difference between the two models. Every 
data word can be viewed as a word over infinite alphabet in which every odd position 
contains a constant symbol. In the context of our paper, we ignore the finite labels, 
thus, Definition 2.1 is more convenient. 

We now define how pebble automata accept words. Let $\gamma = [i, q, \theta]$ be a
configuration of a PA $A$ on a word $w$. We say that $\gamma$ leads to acceptance, if and only if
either $q \in F$, or the following conditions hold.

- if $q \in U$, then for all configurations $\gamma'$ such that $\gamma \vdash \gamma'$, $\gamma'$ leads to acceptance.
- if $q \notin F \cup U$, then there is at least one configuration $\gamma'$ such that $\gamma \vdash \gamma'$ and $\gamma'$
leads to acceptance.

A word $w \in D^*$ is accepted by $A$, if the initial configuration $\gamma_0$ leads to acceptance.
The language $L(A)$ consists of all data words accepted by $A$.

The automaton $A$ is nondeterministic, if the set $U = \emptyset$, and it is deterministic,
if for each configuration, there is exactly one transition that applies. If act $\in$
{right, lift-pebble, place-pebble} for all transitions, then the automaton is
one-way. It turns out that PA languages are quite robust. Namely, alternation and
two-wayness do not increase the expressive power of one-way deterministic PA, as
stated in Theorem 2.4 below.

Remark 2.3. In [Neven et al. 2004] the model defined above is called strong PA. 
A weaker model in which the new pebble is placed at the position of the head 
pebble, is referred to as weak PA. Obviously for two-way PA, strong and weak PA 
are equivalent. However, for one-way PA, strong PA is indeed stronger than weak 
PA. We will postpone our discussion of weak PA until Section 4. 

Theorem 2.4. For each $k \ge 1$, two-way alternating $k$-PA and one-way
deterministic $k$-PA have the same recognition power.

The proof of Theorem 2.4 is a straightforward adaptation of the classical proof of
the equivalence between the expressive power of alternating two-way finite state
automata and deterministic one-way finite state automata [Ladner et al. 1984]. For
this reason, we omit the proof.

The main idea is that when pebble i is the head pebble, due to the stack discipline 
imposed on placing the pebbles, all the other pebbles (pebbles i + 1, . . . , k) are 
fixed on their positions. Hence the transitions of pebble $i$, which are of the form
$(i, P, V, q) \to (p, \mathrm{act})$, can be viewed as transitions over the finite alphabet $(P, V) \in
2^{\{i+1,\ldots,k\}} \times 2^{\{i+1,\ldots,k\}}$. Thus, the idea in [Ladner et al. 1984] can be adapted to PA
in a straightforward manner. The details are available as a technical report in [Tan 
2009]. In view of this equivalence, we will always assume that the pebble automata 
under consideration are deterministic and one-way. 

Next, we define the hierarchy of languages accepted by PA. For $k \ge 1$, $\mathrm{PA}_k$ is the
set of all languages accepted by $k$-PA, and PA is the set of all languages accepted
by pebble automata. That is,

$$\mathrm{PA} = \bigcup_{k \ge 1} \mathrm{PA}_k.$$

2.2 Logic 

Formally, a word $w = a_1 \cdots a_n$ is represented by the logical structure with domain
$\{1, \ldots, n\}$; the natural ordering $<$ on the domain with its induced successor $+1$;
and the equivalence relation $\sim$ on the domain $\{1, \ldots, n\}$, where $i \sim j$ whenever
$a_i = a_j$.

The atomic formulas in this logic are of the form $x < y$, $y = x + 1$, $x \sim y$. The
first-order logic FO(~, <, +1) is obtained by closing the atomic formulas under
the propositional connectives and first-order quantification over $\{1, \ldots, n\}$. The
second-order logic MSO(~, <, +1) is obtained by adding quantification over unary
predicates on $\{1, \ldots, n\}$. A sentence $\varphi$ defines the set of words

$$L(\varphi) = \{w \mid w \models \varphi\}.$$

If $L = L(\varphi)$ for some sentence $\varphi$, then we say that the sentence $\varphi$ expresses the
language $L$.

We use the same notations FO(~, <, +1) and MSO(~, <, +1) to denote the
languages expressible by sentences in FO(~, <, +1) and MSO(~, <, +1), respectively.
That is,

$$\mathrm{FO}(\sim, <, +1) = \{L(\varphi) \mid \varphi \text{ is an FO}(\sim, <, +1) \text{ sentence}\}$$

and

$$\mathrm{MSO}(\sim, <, +1) = \{L(\varphi) \mid \varphi \text{ is an MSO}(\sim, <, +1) \text{ sentence}\}.$$

Proposition 2.5. (See also [Neven et al. 2004, Theorem 4.1]) If $\varphi \in$ FO(~, <, +1)
is a sentence with quantifier rank $k$, then $L(\varphi) \in \mathrm{PA}_k$.

PROOF. (Sketch) First, it is straightforward that languages accepted by two-way
alternating $k$-PA are closed under Boolean operations. By Theorem 2.4, two-way
alternating and one-way deterministic $k$-PA are equivalent. Thus, the class $\mathrm{PA}_k$ is
closed under Boolean operations. Therefore, it is sufficient to prove Proposition 2.5
when the formula $\varphi$ is of the form $Q x_k\, \psi(x_k)$, where $Q \in \{\forall, \exists\}$ and $\psi(x_k)$ is a
formula of quantifier rank $k - 1$.

The proof is by straightforward induction on $k$. A $k$-PA $A$ iterates pebble $k$
through all possible positions in the input word $w$. On each iteration, the automaton
$A$ recursively calls a $(k-1)$-PA $A'$ that accepts the language $L(\psi(x_k))$, treating
the position of pebble $k$ as the assignment value for $x_k$.

The transitions in the PA $A$ can test the atomic formulas $x = y$ and $x \sim y$, while
at the same time remembering in its states the order of the pebbles. The word $w$
is accepted by $A$, if the following holds.

- If $Q$ is $\forall$, then $A$ accepts $w$ if and only if $A'$ accepts on all iterations.

- If $Q$ is $\exists$, then $A$ accepts $w$ if and only if $A'$ accepts on at least one iteration.

This completes the sketch of our proof of Proposition 2.5. □

We end this section with Theorem 2.6 below, which states that a language
accepted by a pebble automaton can be expressed by an MSO(~, <, +1) sentence.

Theorem 2.6. ([Neven et al. 2004, Theorem 4.2]) For every PA $A$, there exists
an MSO(~, <, +1) sentence $\varphi_A$ such that $L(A) = L(\varphi_A)$.

3. WORDS OF D* AS GRAPHS 

This section contains the main results in this paper: 

(1) The strict hierarchy of PA languages based on the number of pebbles. 

(2) The separation of MSO(~, <, +1) from PA languages.

(3) The separation of one-way deterministic RA languages from PA languages. 

All three results share one common idea: We view a word of even length as a 
directed graph. Recall that D is an infinite alphabet, and that we always denote 
the symbols in D by the lower case letters a, b, c, . . ., possibly indexed. 

We consider directed graphs in which the vertices come from $D$. A word $w =
a_0 b_0 \cdots a_n b_n \in D^*$ of even length induces a directed graph $G_w = (V_w, E_w)$, where
$V_w$ is the set of symbols that appear in $w$, that is, $V_w = \{a : a \text{ appears in } w\}$,
and the set of edges is $E_w = \{(a_0, b_0), \ldots, (a_n, b_n)\}$. We also write $s_w = a_0$ and
$t_w = b_n$ to denote the first and the last symbol in $w$, respectively. For convenience,
we consider only the words $w$ in which $s_w$ and $t_w$ occur only once.

As an example, we take the following word $w = ab\ bc\ bd\ cd\ ce\ de\ ef\ eg$. Then
$s_w = a$ and $t_w = g$. The graph induced by $w$ is $G_w = (V_w, E_w)$, where $V_w =
\{a, b, c, d, e, f, g\}$ and $E_w = \{(a, b), (b, c), (b, d), (c, d), (c, e), (d, e), (e, f), (e, g)\}$, as
illustrated in the picture below.




We need the following basic graph terminology. Let $a$ and $b$ be vertices in a graph
$G$. A path of length $m$ from $a$ to $b$ is a sequence of $m$ edges $(a_{i_1}, b_{i_1}), \ldots, (a_{i_m}, b_{i_m})$
in $G$ such that $a_{i_1} = a$, $b_{i_m} = b$ and for each $j = 1, \ldots, m - 1$, $b_{i_j} = a_{i_{j+1}}$. The
distance from $a$ to $b$, denoted by $d_G(a, b)$, is the length of the shortest path from $a$
to $b$ in $G$. If there is no path from $a$ to $b$ in $G$, then we set $d_G(a, b) = \infty$.

We now define the following reachability languages. For $m \ge 1$,

$$\mathcal{R}_m = \{w \mid d_{G_w}(s_w, t_w) \le m\}$$

and

$$\mathcal{R} = \bigcup_{m=1,2,\ldots} \mathcal{R}_m.$$

Here we should remark that since we consider only the words $w$ in which $s_w$ and
$t_w$ occur only once, the language $\mathcal{R}_1$ consists of words of length 2 with different
symbols.
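
To make the definitions concrete, the following Python sketch builds the graph induced by a word and decides membership in the reachability language with bound $m$ by breadth-first search. It is an illustration only: the function names are hypothetical, words over the infinite alphabet are modelled as non-empty Python strings with one character per symbol, and the convention that the first and last symbols occur only once is not enforced.

```python
from collections import deque


def induced_graph(word):
    """Return (V_w, E_w) for a word of even length."""
    if len(word) % 2 != 0:
        raise ValueError("the word must have even length")
    vertices = set(word)
    # Consecutive pairs a_0 b_0, a_1 b_1, ... are the edges.
    edges = {(word[p], word[p + 1]) for p in range(0, len(word), 2)}
    return vertices, edges


def distance(word):
    """Length of a shortest path from the first to the last symbol, or None."""
    _, edges = induced_graph(word)
    source, target = word[0], word[-1]
    successors = {}
    for a, b in edges:
        successors.setdefault(a, []).append(b)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in successors.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # no path: the distance is infinite


def in_reachability_language(word, m):
    """True iff the distance from s_w to t_w in G_w is at most m."""
    d = distance(word)
    return d is not None and d <= m
```

For the example word above, written without spaces as `"abbcbdcdcedeefeg"`, a shortest path from $a$ to $g$ is $a, b, c, e, g$, so the word belongs to the language with bound 4 but not to the one with bound 3.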

Proposition 3.1. For each $k = 2, 3, \ldots$, $\mathcal{R}_{2^k-1} \in \mathrm{PA}_k$.

The proof of this proposition is an implementation of Savitch's algorithm [Savitch 
1970] for (s-t)-reachability by pebble automata. It can be found in Subsection 3.1. 

Lemma 3.2 below is the backbone of most of the results presented in this paper.
For each $i = 0, 1, 2, \ldots$, we define $n_i = 2^{i+1} - 2$. An equivalent recursive definition
is $n_0 = 0$, and $n_{i+1} = 2n_i + 2$, for $i \ge 0$.
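
The agreement between the closed form and the recursion can be checked by a routine induction. The short derivation below is added for illustration and uses only the closed form $n_i = 2^{i+1} - 2$.

```latex
\begin{align*}
n_0 &= 2^{1} - 2 = 0, \\
2 n_i + 2 &= 2\,(2^{i+1} - 2) + 2 = 2^{i+2} - 2 = n_{i+1}.
\end{align*}
```

In particular, $n_1 = 2$, $n_2 = 6$ and $n_3 = 14$.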

Lemma 3.2. For every $k$-pebble automaton $A$, where $k \ge 1$, there exist a word
$w \in \mathcal{R}_{n_k}$ and $\bar{w} \notin \mathcal{R}$ such that either $A$ accepts both $w$ and $\bar{w}$, or $A$ rejects both $w$
and $\bar{w}$.

The proof of Lemma 3.2 is rather long and technical. We present it in Subsec- 
tions 3.2 and 3.3. Meanwhile we discuss a number of consequences of this lemma. 
Corollary 3.3 below immediately follows from the lemma. 

Corollary 3.3. $\mathcal{R}_{n_k} \notin \mathrm{PA}_k$.

Corollary 3.4. $\mathcal{R} \notin \mathrm{PA}$.

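PROOF. Assume to the contrary that $\mathcal{R} = L(A)$ for a $k$-PA $A$. Then, by
Lemma 3.2, there exist a word $w \in \mathcal{R}_{n_k}$ and $\bar{w} \notin \mathcal{R}$ such that either $A$
accepts both $w$ and $\bar{w}$, or $A$ rejects both $w$ and $\bar{w}$. Both yield a contradiction to the
assumption that $\mathcal{R} = L(A)$. □

The following theorem establishes the proper hierarchy of the PA languages.

Theorem 3.5. For each $k = 2, 3, \ldots$, $\mathrm{PA}_k \subsetneq \mathrm{PA}_{k+1}$.

PROOF. We contend that $\mathcal{R}_{2^{k+1}-1} \in \mathrm{PA}_{k+1} - \mathrm{PA}_k$, for each $k = 2, 3, \ldots$. That
$\mathcal{R}_{2^{k+1}-1} \in \mathrm{PA}_{k+1}$ follows from Proposition 3.1. That $\mathcal{R}_{2^{k+1}-1} \notin \mathrm{PA}_k$ follows from
the fact that $n_k = 2^{k+1} - 2 < 2^{k+1} - 1$ and Lemma 3.2. Indeed, since
$\mathcal{R}_{n_k} \subseteq \mathcal{R}_{2^{k+1}-1} \subseteq \mathcal{R}$, the words $w$ and $\bar{w}$ from Lemma 3.2 lie inside and outside
$\mathcal{R}_{2^{k+1}-1}$, respectively, yet no $k$-PA separates them. □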

Another consequence of Corollary 3.4 is that the inclusion of PA in MSO(~, <, +1)
obtained in Theorem 2.6 is proper.

Theorem 3.6. $\mathrm{PA} \subsetneq \mathrm{MSO}(\sim, <, +1)$.

PROOF. Without loss of generality, we may assume that MSO(~, <, +1) contains
two constant symbols, min and max, which denote the minimum and the maximum
elements of the domain, respectively. For a word $w = a_1 \cdots a_n$, the minimum and
the maximum elements are $1$ and $n$, respectively, and not $0$ and $n + 1$, which are
reserved for the end-markers $\triangleleft$ and $\triangleright$.

The language $\mathcal{R}$ can be expressed in MSO(~, <, +1) as follows. There exist unary
predicates $S_{odd}$ and $P$ such that either

- $\min + 1 = \max \wedge \min \not\sim \max$ (to capture $\mathcal{R}_1$),

or the following holds.

- For all $x$, if $x \ne \min$, then $x \not\sim \min$.
(This is to take care of our assumption that the first symbol appears only once.)
- For all $x$, if $x \ne \max$, then $x \not\sim \max$.
(This is to take care of our assumption that the last symbol appears only once.)
- $S_{odd}$ is the set of all odd elements in the domain, where $\min \in S_{odd}$ and $\max \notin S_{odd}$.
- The predicate $P$ satisfies the conjunction of the following FO(~, <, +1) sentences:

  - $P \subseteq S_{odd}$ and $\min \in P$ and $\max - 1 \in P$,

  - for all $x \in P - \{\max - 1\}$, there exists exactly one $y \in P$ such that $x + 1 \sim y$,
and

  - for all $x \in P - \{\min\}$, there exists exactly one $y \in P$ such that $y + 1 \sim x$.

Now, the theorem follows from Corollary 3.4. □

Remark 3.7. Combining Theorems 2.4 and 3.6, we obtain that MSO(~, <, +1) 
is stronger than two-way alternating PA. This settles a question left open in [Neven 
et al. 2004] whether MSO(~, <, +1) is strictly stronger than two-way alternating 
PA. 

Next, we define a restricted version of the reachability languages. For a positive
integer $m \ge 1$, the language $\mathcal{R}^+_m$ consists of all words of the form

$$c_0 c_1 \underbrace{\cdots}_{u_1} c_1 c_2 \underbrace{\cdots}_{u_2} c_2 c_3 \cdots c_{m-3} c_{m-2} \underbrace{\cdots}_{u_{m-2}} c_{m-2} c_{m-1} \underbrace{\cdots}_{u_{m-1}} c_{m-1} c_m$$

where for each $i \in \{0, \ldots, m - 1\}$, the symbol $c_i$ does not appear in $u_i$ and $c_i \ne c_{i+1}$.
The language $\mathcal{R}^+$ is defined as

$$\mathcal{R}^+ = \bigcup_{m=1,2,\ldots} \mathcal{R}^+_m.$$

Remark 3.8. Actually, in the proof of Lemma 3.2 we show that for every $k$-PA
$A$, there exist a word $w \in \mathcal{R}^+_{n_k}$ and $\bar{w} \notin \mathcal{R}^+$ such that either $A$ accepts both $w$ and
$\bar{w}$, or $A$ rejects both $w$ and $\bar{w}$. Therefore, $\mathcal{R}^+ \notin \mathrm{PA}$.

The following theorem answers a question left open in [Neven et al. 2004; Segoufin
2006]: Can one-way deterministic FMA be simulated by pebble automata? (We
refer the reader to [Kaminski and Francez 1994, Definition 1] for the formal definition
of FMA.)

Theorem 3.9. The language $\mathcal{R}^+$ is accepted by one-way deterministic FMA,
but is not accepted by pebble automata.

PROOF. Note that $\mathcal{R}^+$ is accepted by a one-way deterministic FMA with two
registers.† On input word $w = c_0 c_1 \cdots c_{n-1} c_n$, the automaton stores $c_1$ in the first
register and then moves right (using the second register to scan the input symbols)
until it finds a symbol $c_j = c_1$. If it finds one, then it stores $c_{j+1}$ in the first register
and moves right again until it finds another symbol $c_{j'} = c_{j+1}$. It repeats the
process until either of the following holds.

- The symbol in the second last position $c_{n-1}$ is the same as the content of the
first register, or,
- it cannot find a symbol currently stored in the first register.

In the former case, the automaton accepts the input word $w$, and in the latter case
it rejects. By Remark 3.8, the language $\mathcal{R}^+$ is not a PA language. This proves
Theorem 3.9. □
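
The scanning procedure in this proof can be traced with a small simulation. The Python sketch below is a literal reading of the description, not a formal FMA: the name `fma_accepts` is hypothetical, the word is a Python string indexed from 0, the variable `register` plays the role of the first register, and the head position stands in for the second register. It uses the word length to recognise the second last position, and it does not treat words of length 2 specially.

```python
def fma_accepts(word):
    """Simulate the left-to-right scan described in the proof of Theorem 3.9."""
    n = len(word)
    if n < 2:
        return False
    register = word[1]   # store c_1 in the first register
    pos = 2              # continue scanning to the right of c_1
    while pos < n:
        if word[pos] == register:
            if pos == n - 2:
                return True          # match in the second last position: accept
            if pos + 1 >= n:
                return False         # match in the last position: nothing to store
            register = word[pos + 1]  # store the symbol following the match
            pos += 2
        else:
            pos += 1
    return False         # the stored symbol was never found again: reject
```

For instance, on the word `"abxybc"` the simulation stores `b`, skips `x` and `y`, and finds `b` again in the second last position, so it accepts; on `"abxy"` the stored symbol `b` never reappears, so it rejects.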

3.1 Proof of Proposition 3.1 

In this subsection we prove Proposition 3.1. Before we proceed with the proof, we
remark that when processing an input word $w$, an automaton $A$ can remember in
its state whether a pebble is currently at an odd- or even-numbered position in $w$.
Moreover, we always denote the input word $w$ by $a_0 b_0 \cdots a_n b_n$, that is, we denote
the symbols on the odd positions by $a_i$'s and the symbols on the even positions by
$b_i$'s. We can also assume that the automaton always rejects words of odd length.

We are going to construct a $k$-PA $A$ that accepts $\mathcal{R}_{2^k-1}$. Essentially the
automaton $A$ consists of the following subautomata.

- An $i$-PA $A_i^{j,j'}$, for each $i \in \{1, \ldots, k-1\}$ and $j, j' \in \{i+1, \ldots, k\}$.
The purpose of each automaton $A_i^{j,j'}$ is to detect the existence of a path of length $\le 2^i - 1$
from the vertex seen by pebble $j$ to the vertex seen by pebble $j'$.

- An $i$-PA $A_i^{*,j}$, for each $i \in \{1, \ldots, k-1\}$ and $j \in \{i+1, \ldots, k\}$.
The purpose of each automaton $A_i^{*,j}$ is to detect the existence of a path of length $\le 2^i - 1$
from the vertex $s_w$ to the vertex seen by pebble $j$.

- An $i$-PA $A_i^{j,*}$, for each $i \in \{1, \ldots, k-1\}$ and $j \in \{i+1, \ldots, k\}$.
The purpose of each automaton $A_i^{j,*}$ is to detect the existence of a path of length $\le 2^i - 1$
from the vertex seen by pebble $j$ to the vertex $t_w$.

† Here we use the definition of FMA as in [Kaminski and Francez 1994]. If we use the definition
of RA as in [Segoufin 2006; Demri and Lazic 2009], then one register is sufficient to accept $\mathcal{R}^+$.

We are going to show how to construct those subautomata $A_i^{j,j'}$, $A_i^{j,*}$ and $A_i^{*,j}$ by
induction on $i$.

The basis is $i = 1$. The construction of $A_1^{j,j'}$, $A_1^{j,*}$ and $A_1^{*,j}$ is as follows.

- The automaton $A_1^{j,j'}$ performs the following.

(1) It checks whether the symbols seen by pebbles $j$ and $j'$ are the same, which
means that there is a path of length 0 from the vertex seen by pebble $j$ to
the vertex seen by pebble $j'$.

(2) Otherwise, it iterates pebble 1 on every odd position in $w$ checking whether
there exists an index $l$ such that $a_l$ is the same symbol seen by pebble $j$. If
there is, it moves to the right one step to read $b_l$ and checks whether it is the
same symbol seen by pebble $j'$. This means that there is a path of length 1
from the vertex seen by pebble $j$ to the vertex seen by pebble $j'$.

- The automaton $A_1^{*,j}$ simply puts pebble 1 on the second position of $w$ to read
$b_0$ and checks whether it is the same symbol seen by pebble $j$. (Here we use the
assumption that $s_w$ occurs only once in $w$, which implies that there cannot be a
path of length 0 in this case.)

- The automaton $A_1^{j,*}$ simply puts pebble 1 on the second last position of $w$ to
read $a_n$ and checks whether it is the same symbol seen by pebble $j$. (Here we use
the assumption that $t_w$ occurs only once in $w$, which implies that there cannot
be a path of length 0 in this case.)

For the induction step, we describe the construction of the automata $A_i^{j,j'}$, $A_i^{j,*}$
and $A_i^{*,j}$ as follows.

- The automaton $A_i^{j,j'}$ performs the following. It iterates pebble $i$ on each position
in the input word $w$.

(1) When pebble $i$ is on the odd position reading the symbol $a_l$, it invokes the
automaton $A_{i-1}^{j,i}$ to check whether there exists a path of length $\le 2^{i-1} - 1$
from the vertex seen by pebble $j$ to the vertex $a_l$.

(2) If there is such a path, it moves pebble $i$ one step to the right reading the
symbol $b_l$. It then invokes the automaton $A_{i-1}^{i,j'}$ to check whether there exists
a path of length $\le 2^{i-1} - 1$ from the vertex $b_l$ to the vertex seen by pebble $j'$.

Now there exists a path of length $\le 2^i - 1$ from the vertex seen by pebble $j$ to
the vertex seen by pebble $j'$ if and only if there exists an index $l$ such that (i)
there exists a path of length $\le 2^{i-1} - 1$ from the vertex seen by pebble $j$ to the
vertex $a_l$, and (ii) there exists a path of length $\le 2^{i-1} - 1$ from the vertex $b_l$ to
the vertex seen by pebble $j'$. This implies the correctness of our construction of
$A_i^{j,j'}$.

- The automaton $A_i^{*,j}$ performs the following. It iterates pebble $i$ on each position
in the input word $w$.

(1) When pebble $i$ is on the odd position reading the symbol $a_l$, it invokes the
automaton $A_{i-1}^{*,i}$ to check whether there exists a path of length $\le 2^{i-1} - 1$
from the vertex $s_w$ to the vertex $a_l$.

(2) If there is such a path, it moves pebble $i$ one step to the right reading the
symbol $b_l$. It then invokes the automaton $A_{i-1}^{i,j}$ to check whether there exists
a path of length $\le 2^{i-1} - 1$ from the vertex $b_l$ to the vertex seen by pebble $j$.

It follows immediately that $A_i^{*,j}$ checks the existence of a path of length $\le 2^i - 1$ from the
vertex $s_w$ to the vertex seen by pebble $j$.
- The automaton $A_i^{j,*}$ performs the following. It iterates pebble $i$ on each position
in the input word $w$.

(1) When pebble $i$ is on the odd position reading the symbol $a_l$, it invokes the
automaton $A_{i-1}^{j,i}$ to check whether there exists a path of length $\le 2^{i-1} - 1$
from the vertex seen by pebble $j$ to the vertex $a_l$.

(2) If there is such a path, it moves pebble $i$ one step to the right reading the
symbol $b_l$. It then invokes the automaton $A_{i-1}^{i,*}$ to check whether there exists
a path of length $\le 2^{i-1} - 1$ from the vertex $b_l$ to the vertex $t_w$.

It follows immediately that $A_i^{j,*}$ checks the existence of a path of length $\le 2^i - 1$ from the
vertex seen by pebble $j$ to the vertex $t_w$.

Now the automaton $A$ performs the following. It iterates pebble $k$ over each position in the input word $w$.

(1) When pebble $k$ is on an odd position reading the symbol $a_l$, it invokes the automaton $A_{k-1}^{s,k}$ to check whether there exists a path of length $\le 2^{k-1} - 1$ from the vertex $s_w$ to the vertex $a_l$.

(2) If there is such a path, it moves pebble $k$ one step to the right, reading the symbol $b_l$. It then invokes the automaton $A_{k-1}^{k,t}$ to check whether there exists a path of length $\le 2^{k-1} - 1$ from the vertex $b_l$ to the vertex $t_w$.

Hence, $A$ is the desired automaton for $\mathcal{R}_{2^k-1}$ and this completes the proof of Proposition 3.1.
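The recursion behind this construction can be mimicked by an ordinary recursive procedure. The following is only a sketch under stated assumptions, not the automaton itself: the function names `edges_of` and `path_within` are hypothetical, recursion depth stands in for the pebble index, and the empty path is assumed to count as a path of length 0.

```python
from typing import Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def edges_of(word: List[Hashable]) -> List[Edge]:
    """Read w = a0 b0 a1 b1 ... an bn as the edge list (a0,b0), ..., (an,bn)."""
    if len(word) % 2 != 0:
        raise ValueError("word must have even length")
    return [(word[p], word[p + 1]) for p in range(0, len(word), 2)]


def path_within(edges: List[Edge], x: Hashable, y: Hashable, i: int) -> bool:
    """Return True iff there is a path of length <= 2**i - 1 from x to y."""
    if x == y:
        # Assumption of this sketch: the empty path counts.
        return True
    if i == 0:
        # 2**0 - 1 = 0 and x != y, so no path is short enough.
        return False
    # The loop over edges plays the role of pebble i scanning the word:
    # pick an edge (a, b), then solve two half-size problems.
    return any(
        path_within(edges, x, a, i - 1) and path_within(edges, b, y, i - 1)
        for (a, b) in edges
    )
```

Each level of recursion halves the length bound, which is why $k$ levels (pebbles) suffice for paths of length up to $2^k - 1$.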

3.2 Proof of Lemma 3.2 

The proof of Lemma 3.2 is rather long and technical. This subsection and the next 
are devoted to it. 

Recall that for each $i \in \{0, 1, 2, \ldots\}$, we define $n_i = 2^{i+1} - 2$. An equivalent recursive definition is $n_0 = 0$, and $n_i = 2n_{i-1} + 2$, when $i \ge 1$.
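As a quick check that the two definitions of $n_i$ agree, substitute the closed form into the recursion:

```latex
\begin{align*}
n_0 &= 2^{1} - 2 = 0, \\
2n_{i-1} + 2 &= 2\,(2^{i} - 2) + 2 = 2^{i+1} - 2 = n_i \qquad (i \ge 1).
\end{align*}
% First values: n_0 = 0, n_1 = 2, n_2 = 6, n_3 = 14.
```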

By Theorem 2.4, it is sufficient to consider only one-way deterministic PA $A$. Let $A = \langle Q, q_0, \mu, F \rangle$ be a strong $k$-PA. By adding some extra states, we can normalise the behaviour of each pebble as follows. For each $i \in \{1, \ldots, k\}$, pebble $i$ behaves as follows.

— After pebble $i$ moves right and $i > 1$, then pebble $(i - 1)$ is immediately placed (in position 0 reading the left end-marker $\triangleleft$).

— If $i < k$, pebble $i$ is lifted only when it reaches the right end-marker $\triangleright$ of the input. Immediately after pebble $i$ is lifted, pebble $(i + 1)$ moves right.

We also assume that in the automaton $A$ only pebble $k$ can enter a final state and it may do so only after it reads the right end-marker $\triangleright$ of the input.

We define the following integers: $\beta_0 = 1$, $\beta_1 = |Q|$, and for $i \ge 2$,‡

$\beta_i = |Q|! \times \beta_{i-1}!$

‡ ! denotes factorial.
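To get a feeling for how quickly these integers grow, the definition can be evaluated directly. This is an illustrative sketch only; the function name `betas` and its parameters are hypothetical and do not appear in the paper.

```python
from math import factorial
from typing import List


def betas(q_size: int, upto: int) -> List[int]:
    """Return [beta_0, ..., beta_upto] for an automaton with q_size states."""
    if q_size < 1 or upto < 0:
        raise ValueError("need q_size >= 1 and upto >= 0")
    seq = [1]                      # beta_0 = 1
    if upto >= 1:
        seq.append(q_size)         # beta_1 = |Q|
    for i in range(2, upto + 1):
        # beta_i = |Q|! * (beta_{i-1})!
        seq.append(factorial(q_size) * factorial(seq[i - 1]))
    return seq
```

Since $\beta_i$ divides $\beta_i!$, each value divides the next one, which is the property the later divisibility arguments rely on.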
For the rest of this subsection and the next, we fix the integers $k$ and $m$, where $k$ is the number of pebbles of $A$ and $m = \beta_{k+1}$.

We define the following graph $G_{n_k,m} = (V_{n_k,m}, E_{n_k,m})$. The set $V_{n_k,m}$ consists of the following vertices.

— $a_0, a_1, \ldots, a_{n_k}$;
— $b_0, \ldots, b_{n_k-1}$;
— $c_{1,i}, \ldots, c_{n_k-1,i}$, for each $i = 1, \ldots, m-1$; and
— $d_{1,i}, \ldots, d_{n_k-1,i}$, for each $i = 1, \ldots, m-1$,

where $a_0, \ldots, a_{n_k}, b_0, \ldots, b_{n_k-1}, c_{1,1}, \ldots, c_{n_k-1,m-1}, d_{1,1}, \ldots, d_{n_k-1,m-1}$ are all different. The set $E_{n_k,m}$ consists of the following edges.

— $(a_0, a_1), (a_1, a_2), \ldots, (a_{n_k-1}, a_{n_k})$;
— $(b_0, b_1), (b_1, b_2), \ldots, (b_{n_k-2}, b_{n_k-1})$;
— $(c_{1,i}, c_{2,i}), (c_{2,i}, c_{3,i}), \ldots, (c_{n_k-2,i}, c_{n_k-1,i})$, for each $i = 1, \ldots, m-1$; and
— $(d_{1,i}, d_{2,i}), (d_{2,i}, d_{3,i}), \ldots, (d_{n_k-2,i}, d_{n_k-1,i})$, for each $i = 1, \ldots, m-1$.

Figure 1 below illustrates the graph $G_{n_k,m}$.



[Figure 1 (drawing lost in extraction): the chains $a_0, a_1, \ldots, a_{n_k}$; $c_{1,i}, \ldots, c_{n_k-1,i}$ for $i = 1, \ldots, m-1$; $b_0, \ldots, b_{n_k-1}$; and $d_{1,i}, \ldots, d_{n_k-1,i}$ for $i = 1, \ldots, m-1$, with a dashed box around some of the nodes.]

Fig. 1. The full graph is the graph $G_{n_k,m}$. The graph depicted by $\bar{w}(n_k,m)$ is also the above graph but without the nodes inside the dashed box and the edges adjacent to them.

Now consider the following word $w(n_k,m)$:

$w(n_k,m) = a_0a_1C_1b_0b_1D_1 \cdots a_{n_k-2}a_{n_k-1}C_{n_k-1}b_{n_k-2}b_{n_k-1}D_{n_k-1}a_{n_k-1}a_{n_k}$   (1)

where for each $i = 0, 1, \ldots, n_k - 2$,

$C_i = c_{i,1}c_{i+1,1} \cdots c_{i,m-1}c_{i+1,m-1}$,

$D_i = d_{i,1}d_{i+1,1} \cdots d_{i,m-1}d_{i+1,m-1}$.

This word $w(n_k,m)$ induces the graph $G_{n_k,m}$, that is, $G_{w(n_k,m)} = G_{n_k,m}$, and $s_{w(n_k,m)} = a_0$ and $t_{w(n_k,m)} = a_{n_k}$.

Now let

$\bar{w}(n_k,m) = a_0a_1C_1b_0b_1D_1 \cdots a_{n_k-2}a_{n_k-1}C_{n_k-1}b_{n_k-2}b_{n_k-1}$.   (2)

That is, the word $\bar{w}(n_k,m)$ is obtained by deleting the suffix $D_{n_k-1}a_{n_k-1}a_{n_k}$ from $w(n_k,m)$.

The graph $G_{\bar{w}(n_k,m)}$ is also illustrated in Figure 1: it is the graph without the nodes inside the dashed box and the edges adjacent to them. Note that $s_{\bar{w}(n_k,m)} = a_0$ and $t_{\bar{w}(n_k,m)} = b_{n_k-1}$. Obviously, $w(n_k,m) \in \mathcal{R}_{n_k}$, while $\bar{w}(n_k,m) \notin \mathcal{R}_{n_k}$.
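The two words can be generated mechanically, which makes the membership claims easy to test on small parameters. The sketch below is an illustration under assumptions: symbols are encoded as hypothetical tuples such as `("a", 3)` or `("c", l, j)`, the block for index `l` follows the displayed formula literally (so it also emits symbols with first index `l + 1`), and `distance` is a plain breadth-first search rather than anything a pebble automaton does.

```python
from collections import deque


def C(l, m):
    """Segment c_{l,1} c_{l+1,1} ... c_{l,m-1} c_{l+1,m-1}."""
    return [s for j in range(1, m) for s in (("c", l, j), ("c", l + 1, j))]


def D(l, m):
    """Segment d_{l,1} d_{l+1,1} ... d_{l,m-1} d_{l+1,m-1}."""
    return [s for j in range(1, m) for s in (("d", l, j), ("d", l + 1, j))]


def build_w(n_k, m):
    """The word (1): blocks l = 1..n_k-1, then the final pair a_{n_k-1} a_{n_k}."""
    w = []
    for l in range(1, n_k):
        w += [("a", l - 1), ("a", l)] + C(l, m)
        w += [("b", l - 1), ("b", l)] + D(l, m)
    return w + [("a", n_k - 1), ("a", n_k)]


def build_w_bar(n_k, m):
    """The word (2): drop the suffix D_{n_k-1} a_{n_k-1} a_{n_k} of length 2m."""
    w = build_w(n_k, m)
    return w[: len(w) - 2 * m]


def distance(word, src, dst):
    """Length of a shortest path from src to dst in G_word, or None."""
    adj = {}
    for p in range(0, len(word), 2):
        adj.setdefault(word[p], []).append(word[p + 1])
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None
```

For instance, with $n_k = 6$ and $m = 3$ the shortest path from $a_0$ to $a_{n_k}$ in the first word has length exactly $n_k$, while the second word has no path at all from $a_0$ to $b_{n_k-1}$.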

To prove Lemma 3.2, we are going to prove the following proposition. 

Proposition 3.10. The automaton $A$ either accepts both $w(n_k,m)$ and $\bar{w}(n_k,m)$, or rejects both $w(n_k,m)$ and $\bar{w}(n_k,m)$.

The proof is rather complicated. It consists of five claims and their interdependence is illustrated below.§



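[Diagram (lost in extraction): the dependencies among Proposition 3.10 and Claims 1, 2, 4 and 5. Claim 4 is annotated "Proof by induction. The basis is proved as Claim 3"; Claims 2 and 5 are annotated "Proof by simultaneous induction".]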



In the proof we will need quite a number of notions which, for the sake of readability, are listed below one by one before we define them properly.

— The notions of K(l) and L(l). 

— The notion of successor of a pebble assignment. 

— The notion of compatibility between two pebble assignments. 

§We are going to prove Claims 2 and 5 by induction simultaneously. This will be made precise in 
Subsection 3.3. 

ACM Transactions on Computational Logic, Vol. V, No. N, April 2012. 



Graph Reachability and Pebble Automata over Infinite Alphabets • 15 



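The notions of $K(l)$ and $L(l)$. For $l \in \{0, 1, \ldots, n_k - 1\}$, we define the integers $K(l)$ and $L(l)$, which are illustrated as follows.

[Illustration (lost in extraction): in $w(n_k,m) = a_0a_1 C_1 b_0b_1 D_1 \cdots C_{l-1} b_{l-2}b_{l-1} D_{l-1}\; a_{l-1}a_l\; C_l \cdots D_l\; a_la_{l+1} \cdots$, the prefix ending with $a_{l-1}a_l$ is of length $K(l)$, the segment between $a_{l-1}a_l$ and $a_la_{l+1}$ is of length $4(m-1) + 2$, the prefix ending with $D_l$ is of length $L(l)$, and the prefix ending with $a_la_{l+1}$ is of length $K(l+1)$.]

Formally, for $l \in \{0, 1, \ldots, n_k\}$,

$K(l) = 0$, if $l = 0$; and $K(l) = 4m(l-1) + 2$, if $l \ge 1$;

and for $l \in \{0, 1, \ldots, n_k\}$,

$L(l) = K(l+1) - 2$, if $l \le n_k - 1$; and $L(l) = K(n_k)$, otherwise.

In particular, $K(n_k)$ is precisely the length of the word $w(n_k,m)$ and $L(0) = 0$.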
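The two position functions are simple enough to state as code, which helps when checking index arithmetic in the later claims. This is a sketch only; passing `m` and `n_k` as explicit arguments is a choice of this illustration.

```python
def K(l: int, m: int) -> int:
    """Position of the pair a_{l-1} a_l: 0 if l == 0, else 4m(l-1) + 2."""
    if l < 0:
        raise ValueError("l must be non-negative")
    return 0 if l == 0 else 4 * m * (l - 1) + 2


def L(l: int, m: int, n_k: int) -> int:
    """End of block l: K(l+1) - 2 if l <= n_k - 1, else K(n_k)."""
    if l < 0:
        raise ValueError("l must be non-negative")
    return K(l + 1, m) - 2 if l <= n_k - 1 else K(n_k, m)
```

With these definitions, $L(l) - K(l) = 4(m-1) + 2$ for $1 \le l \le n_k - 1$, matching the segment length in the illustration, and $K(l+1) - L(l) = 2$ accounts for the pair $a_la_{l+1}$.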

The notion of successor of a pebble assignment. Let $\theta$ be an assignment of pebbles $i, i+1, \ldots, k$ of $A$ on a word $w$. That is, $\theta$ is a function from $\{i, i+1, \ldots, k\}$ to $\{0, 1, \ldots, |w| + 1\}$. (Recall that positions 0 and $|w| + 1$ contain the left and right end-markers $\triangleleft$ and $\triangleright$, respectively.) If $0 \le \theta(i) \le |w|$, we define $Succ_i(\theta) = \theta'$, where for each $j \in \{i, i+1, \ldots, k\}$,

$\theta'(j) = \theta(j)$, if $j \ge i + 1$; and $\theta'(j) = \theta(i) + 1$, if $j = i$.
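In code, the successor simply advances the head pebble by one position and leaves the others alone. The sketch below assumes, purely for illustration, that an assignment is stored as a dictionary from pebble indices to positions; the name `succ` is hypothetical.

```python
from typing import Dict


def succ(i: int, theta: Dict[int, int], word_len: int) -> Dict[int, int]:
    """Return Succ_i(theta): pebble i moves one step right, others stay."""
    if not 0 <= theta[i] <= word_len:
        raise ValueError("Succ_i is only defined when 0 <= theta(i) <= |w|")
    new_theta = dict(theta)   # copy, so the input assignment is not modified
    new_theta[i] += 1
    return new_theta
```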



The notion of compatibility between two configurations. Let $i \ge 1$ and let $[i,q,\theta]$ and $[i,q,\bar\theta]$ be configurations of $A$ on $w(n_k,m)$ and $\bar{w}(n_k,m)$, respectively, when pebble $i$ is the head pebble. For an integer $l \in \{0, 1, \ldots, n_k\}$, we say that the configurations $[i,q,\theta]$ and $[i,q,\bar\theta]$ are compatible with respect to $l$, if

and for each $j \in \{i, \ldots, k\}$,

— either $\theta(j) \le K(l)$ or $\theta(j) > L(l + n_i)$;

— either $\bar\theta(j) \le K(l)$ or $\bar\theta(j) > L(l + n_i) - 2m$;

— if $\theta(j) \le K(l)$, then $\bar\theta(j) \le K(l)$ and $\theta(j) = \bar\theta(j)$;

— if $\bar\theta(j) \le K(l)$, then $\theta(j) \le K(l)$ and $\theta(j) = \bar\theta(j)$;

— if $\theta(j) > L(l + n_i)$, then $\bar\theta(j) > L(l + n_i) - 2m$ and $\theta(j) = \bar\theta(j) + 2m$;

— if $\bar\theta(j) > L(l + n_i) - 2m$, then $\theta(j) > L(l + n_i)$ and $\theta(j) = \bar\theta(j) + 2m$.

Below we give an illustration of the compatibility of two configurations of an 8-PA on $w(n_8,m)$ and $\bar{w}(n_8,m)$, respectively, with respect to $l$. The index $\ell$ is $l + n_5$.
[Illustration (lost in extraction): the words $w(n_8,m)$ and $\bar{w}(n_8,m)$ with the positions $K(l)$ and $L(\ell)$ marked. Pebbles 5, 6, 7, and 8 lie either at positions up to $K(l)$ or beyond $L(\ell)$ in $w(n_8,m)$, and beyond $L(\ell) - 2m$ in $\bar{w}(n_8,m)$; the stretch in between is marked "No pebble here".]



Claim 1. Suppose that $[i,q,\theta]$ and $[i,q,\bar\theta]$ are configurations of $A$ on $w(n_k,m)$ and $\bar{w}(n_k,m)$, respectively. If $[i,q,\theta]$ and $[i,q,\bar\theta]$ are compatible with respect to some $l \in \{0, \ldots, n_k\}$, then

(1) for all $h \in \{0, \ldots, K(l + n_{i-1} + 2)\}$ and for all $p \in Q$, the configuration $[i-1, p, \theta \cup \{(i-1, h)\}]$ (on $w(n_k,m)$) and the configuration $[i-1, p, \bar\theta \cup \{(i-1, h)\}]$ (on $\bar{w}(n_k,m)$) are compatible with respect to $l + n_{i-1} + 2$;

(2) for all $h \in \{L(l + n_{i-1}), \ldots, K(n_k)\}$ and for all $p \in Q$, the configuration $[i-1, p, \theta \cup \{(i-1, h)\}]$ (on $w(n_k,m)$) and the configuration $[i-1, p, \bar\theta \cup \{(i-1, h - 2m)\}]$ (on $\bar{w}(n_k,m)$) are compatible with respect to $l$.

Proof. It follows from the fact that $n_i = 2n_{i-1} + 2$. We prove it by picture here. For case (1), the proof is as follows. Let $l' = l + n_i$.



[Illustration (lost in extraction): the words $w(n_k,m)$ and $\bar{w}(n_k,m)$ with the positions $K(l)$, $K(l + n_{i-1} + 2)$ and $L(l')$ marked ($L(l') - 2m$ in $\bar{w}(n_k,m)$). Pebble $i-1$ lies at a position up to $K(l + n_{i-1} + 2)$; the stretch after it, up to $L(l')$, is marked "No pebble here".]



There is no pebble on the positions between $K(l + n_{i-1} + 2)$ and $L(l')$ in the word $w(n_k,m)$, as well as on the positions between $K(l + n_{i-1} + 2)$ and $L(l') - 2m$ in the word $\bar{w}(n_k,m)$, due to the assumption that $[i,q,\theta]$ and $[i,q,\bar\theta]$ are compatible with respect to $l$. Since $l' - (l + n_{i-1} + 2) = n_{i-1}$, the configuration $[i-1, p, \theta \cup \{(i-1, h)\}]$ (on $w(n_k,m)$) and the configuration $[i-1, p, \bar\theta \cup \{(i-1, h)\}]$ (on $\bar{w}(n_k,m)$) are compatible with respect to $l + n_{i-1} + 2$, for all $h \in \{0, \ldots, K(l + n_{i-1} + 2)\}$ and for all $p \in Q$.

For case (2), the proof is as follows. We let $l'' = l + n_{i-1}$.




[Illustration (lost in extraction): the words $w(n_k,m)$ and $\bar{w}(n_k,m)$ with the positions $K(l)$ and $L(l'')$ marked ($L(l'') - 2m$ in $\bar{w}(n_k,m)$). The stretch between them is marked "No pebble here"; pebble $i-1$ lies beyond it.]

There is no pebble on the positions between $K(l)$ and $L(l'')$ in the word $w(n_k,m)$, as well as on the positions between $K(l)$ and $L(l'') - 2m$ in the word $\bar{w}(n_k,m)$, due to the assumption that $[i,q,\theta]$ and $[i,q,\bar\theta]$ are compatible with respect to $l$. Hence, case (2) follows immediately. This completes the proof of Claim 1. □

Remark 3.11. Let $[i,q,\theta]$ and $[i,q,\bar\theta]$ be configurations of $A$ on $w(n_k,m)$ and $\bar{w}(n_k,m)$, respectively, and assume that they are compatible with respect to an integer $l$. Let $j, j' \in \{i, i+1, \ldots, k\}$ and let

— $x$ and $y$ denote the symbols seen by pebbles $j$ and $j'$, respectively, on $w(n_k,m)$ according to the configuration $\theta$, and

— $\bar{x}$ and $\bar{y}$ denote the symbols seen by pebbles $j$ and $j'$, respectively, on $\bar{w}(n_k,m)$ according to the configuration $\bar\theta$.

Then $x = y$ if and only if $\bar{x} = \bar{y}$.

The reason is as follows. Since $[i,q,\theta]$ and $[i,q,\bar\theta]$ are compatible with respect to $l$, we have the following four cases.

(a) $\theta(j) \le K(l)$ and $\theta(j') \le K(l)$.

In this case, $\theta(j) = \bar\theta(j)$ and $\theta(j') = \bar\theta(j')$, and we immediately have $x = y$ if and only if $\bar{x} = \bar{y}$.

(b) $\theta(j) \le K(l)$ and $\theta(j') > L(l + n_i)$.

In this case, $\theta(j) = \bar\theta(j)$ and $\bar\theta(j') = \theta(j') - 2m$. Now in $w(n_k,m)$ and $\bar{w}(n_k,m)$ each symbol appears at most twice, and the two occurrences are of distance $4m - 2$ apart. Since $L(l + n_i) - K(l) > 4m - 2$, we have $x \ne y$. Similarly, $L(l + n_i) - 2m - K(l) > 4m - 2$, hence $\bar{x} \ne \bar{y}$.

(c) $\theta(j) > L(l + n_i)$ and $\theta(j') \le K(l)$.

The proof is similar to case (b) above.

(d) $\theta(j) > L(l + n_i)$ and $\theta(j') > L(l + n_i)$.

In this case, $\bar\theta(j) = \theta(j) - 2m$ and $\bar\theta(j') = \theta(j') - 2m$, and we immediately have $x = y$ if and only if $\bar{x} = \bar{y}$.

Now this immediately implies that for every transition $\alpha \to \beta$ of the automaton $A$, it applies to $[i,q,\theta]$ if and only if it applies to $[i,q,\bar\theta]$.

The following claim is important. However, due to the complexity of its proof, 
we postpone it until Subsection 3.3. 
Claim 2. For each $i \in \{1, \ldots, k\}$, and for every run of $A$ on $w(n_k,m)$:

$[i,p_0,\theta_0] \vdash^*_{A,w(n_k,m)} [i,p_1,\theta_1] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i,p_{N+1},\theta_{N+1}]$   (3)

where

— $N = K(n_k)$ = length of $w(n_k,m)$;
— $\theta_0(i) = 0$;
— $\theta_{N+1}(i) = N + 1$;
— $\theta_{h+1} = Succ_i(\theta_h)$, for each $h \in \{0, \ldots, N\}$; that is, for each $j \in \{i+1, \ldots, k\}$, $\theta_0(j) = \cdots = \theta_{N+1}(j)$ and $\theta_h(i) = h$, for each $h \in \{0, \ldots, N+1\}$;

if $l$ is an integer such that

(1) if $i = k$, then $l = 0$; and

(2) if $i \ne k$, then $l$ is an integer such that for each $j \in \{i+1, \ldots, k\}$, either $\theta_0(j) \le K(l)$, or $\theta_0(j) \ge L(l + n_i) + 1$,

then there exist two positive integers $\nu_0$ and $\nu$ such that

— $\nu = \pi\beta_{i-1}!$, where $1 \le \pi \le |Q|$;

— $K(l + n_{i-1} + 1) + 1 \le \nu_0 \le K(l + n_{i-1} + 1) + \beta_i$;

— for each $h$ where $\nu_0 \le h \le K(l + n_{i-1} + 2) - \nu$, we have $p_h = p_{h+\nu}$.

In particular, since $\beta_{i+1} = |Q|! \times \beta_i!$ and $m = \beta_{k+1}$, we have that $\nu$ divides $\beta_{i+1}$, and thus $\nu$ also divides $m$. Therefore, $p_{K(l+n_{i-1}+2)-2-2m} = p_{K(l+n_{i-1}+2)-2}$.
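The divisibility step can be spelled out as follows. This is a reading of the argument rather than a quotation of it, and it assumes that the index $K(l+n_{i-1}+2)-2-2m$ still lies in the range where the periodicity holds.

```latex
\begin{align*}
&1 \le \pi \le |Q| \;\Rightarrow\; \pi \mid |Q|!,
 \qquad \beta_{i-1} \le \beta_i \;\Rightarrow\; \beta_{i-1}! \mid \beta_i!, \\
&\text{hence}\quad \nu = \pi\,\beta_{i-1}! \;\Big|\; |Q|!\,\beta_i! = \beta_{i+1}. \\
&\text{Moreover}\quad \beta_j \mid \beta_j! \mid |Q|!\,\beta_j! = \beta_{j+1}
 \quad\text{for every } j \ge 1,
 \quad\text{so}\quad \beta_{i+1} \mid \beta_{i+2} \mid \cdots \mid \beta_{k+1} = m. \\
&\text{Writing } 2m = c\,\nu \text{ and applying } p_h = p_{h+\nu} \text{ exactly } c \text{ times gives}\quad
 p_{K(l+n_{i-1}+2)-2-2m} = p_{K(l+n_{i-1}+2)-2}.
\end{align*}
```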

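Below we give an illustration of the intuitive meaning of the indexes $l, \nu_0, \nu$ in Claim 2 for $i \ne k$. Let $l$ be the integer assumed in the hypothesis of Claim 2. (For simplicity, we do not put the indexes on the $a$'s.)

[Illustration (lost in extraction): the word $w(n_k,m)$ with the positions $K(l)$, $K(l + n_{i-1} + 1)$, $K(l + n_{i-1} + 2)$ and $L(l + n_i)$ marked. The stretch between $K(l)$ and $L(l + n_i)$ is marked "Pebbles $i+1, \ldots, k$ are not here"; region (*) lies between $K(l + n_{i-1} + 1)$ and $K(l + n_{i-1} + 2)$.]

The meaning of Claim 2 is that in region (*) pebble $i$ enters the same state every $\nu$ steps.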

Claim 3. Let

$[1,p_0,\theta_0] \vdash_{A,w(n_k,m)} \cdots \vdash_{A,w(n_k,m)} [1,p_N,\theta_N] \vdash_{A,w(n_k,m)} [1,p_{N+1},\theta_{N+1}]$

be a run of $A$ on $w(n_k,m)$, where $N$ is the length of $w(n_k,m)$ and $\theta_0(1) = 0$, and $\theta_{j+1} = Succ_1(\theta_j)$, for each $j \in \{0, \ldots, N\}$; and let

$[1,r_0,\bar\theta_0] \vdash_{A,\bar{w}(n_k,m)} \cdots \vdash_{A,\bar{w}(n_k,m)} [1,r_M,\bar\theta_M] \vdash_{A,\bar{w}(n_k,m)} [1,r_{M+1},\bar\theta_{M+1}]$

be a run of $A$ on $\bar{w}(n_k,m)$, where $M$ is the length of $\bar{w}(n_k,m)$ and $\bar\theta_0(1) = 0$, and $\bar\theta_{j+1} = Succ_1(\bar\theta_j)$, for each $j \in \{0, \ldots, M\}$.

If $[1,p_0,\theta_0]$ and $[1,r_0,\bar\theta_0]$ are compatible with respect to an $l \in \{0, \ldots, n_k - n_1\}$, then $p_{N+1} = r_{M+1}$.

Proof. Consider the run

$[1,p_0,\theta_0] \vdash_{A,w(n_k,m)} \cdots \vdash_{A,w(n_k,m)} [1,p_N,\theta_N] \vdash_{A,w(n_k,m)} [1,p_{N+1},\theta_{N+1}]$,

where $\theta_0(1) = 0$, and $\theta_{j+1} = Succ_1(\theta_j)$, for each $j \in \{0, \ldots, N\}$; and the run

$[1,r_0,\bar\theta_0] \vdash_{A,\bar{w}(n_k,m)} \cdots \vdash_{A,\bar{w}(n_k,m)} [1,r_M,\bar\theta_M] \vdash_{A,\bar{w}(n_k,m)} [1,r_{M+1},\bar\theta_{M+1}]$,

where $\bar\theta_0(1) = 0$, and $\bar\theta_{j+1} = Succ_1(\bar\theta_j)$, for each $j \in \{0, \ldots, M\}$.

Suppose that $[1,p_0,\theta_0]$ and $[1,r_0,\bar\theta_0]$ are compatible with respect to an integer $l$. This means that $p_0 = r_0$. We are going to show that $p_{N+1} = r_{M+1}$ in three stages. (In the following let $l' = l + 2$.)

Stage 1. $p_{K(l')} = r_{K(l')}$.

To prove this, we show that $p_h = r_h$, for each $h \in \{0, \ldots, K(l')\}$. The proof is by induction on $h$. The base case, $h = 0$, follows from the compatibility of $[1,p_0,\theta_0]$ and $[1,r_0,\bar\theta_0]$.

For the induction step, suppose that $p_h = r_h$. By Remark 3.11, a transition $\alpha \to \beta$ applies to $[1,p_h,\theta_h]$ if and only if it applies to $[1,r_h,\bar\theta_h]$. Hence, $p_{h+1} = r_{h+1}$.

Stage 2. $p_{K(l')-2} = p_{K(l')-2m-2} = r_{K(l')-2m-2}$.

In Stage 1, we already showed that $p_{K(l')-2m-2} = r_{K(l')-2m-2}$. That $p_{K(l')-2} = p_{K(l')-2m-2}$ follows from Claim 2.

Stage 3. $p_{N+1} = r_{M+1}$.

We are going to prove that $p_h = r_{h-2m}$, for each $h \in \{K(l') - 2, \ldots, N + 1\}$. Since $M = N - 2m$, the case $h = N + 1$ is the desired equality.

The proof is by induction on $h$. The base case, $h = K(l') - 2$, is already shown in Stage 2.

For the induction step, suppose that $p_h = r_{h-2m}$. By Remark 3.11, a transition $\alpha \to \beta$ applies to $[1,p_h,\theta_h]$ if and only if it applies to $[1,r_{h-2m},\bar\theta_{h-2m}]$. Thus, $p_{h+1} = r_{h+1-2m}$.

This completes the proof of Claim 3. □

The following claim is the generalisation of Claim 3 which implies Proposition 3.10.

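Claim 4. For each $i \in \{1, \ldots, k\}$, the following holds. Let

$[i,p_0,\theta_0] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i,p_N,\theta_N] \vdash^*_{A,w(n_k,m)} [i,p_{N+1},\theta_{N+1}]$

be a run of $A$ on $w(n_k,m)$, where $N$ is the length of $w(n_k,m)$ and $\theta_0(i) = 0$, and $\theta_{j+1} = Succ_i(\theta_j)$, for each $j \in \{0, \ldots, N\}$; and let

$[i,r_0,\bar\theta_0] \vdash^*_{A,\bar{w}(n_k,m)} \cdots \vdash^*_{A,\bar{w}(n_k,m)} [i,r_M,\bar\theta_M] \vdash^*_{A,\bar{w}(n_k,m)} [i,r_{M+1},\bar\theta_{M+1}]$

be a run of $A$ on $\bar{w}(n_k,m)$, where $M$ is the length of $\bar{w}(n_k,m)$ and $\bar\theta_0(i) = 0$, and $\bar\theta_{j+1} = Succ_i(\bar\theta_j)$, for each $j \in \{0, \ldots, M\}$.

If $[i,p_0,\theta_0]$ and $[i,r_0,\bar\theta_0]$ are compatible with respect to an $l \in \{0, \ldots, n_k - n_i\}$, then $p_{N+1} = r_{M+1}$.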
Proof. The proof is by induction on $i$. The basis is $i = 1$, which we have already proved in Claim 3.

For the induction hypothesis, we assume that Claim 4 holds for the case of $i - 1$. We are going to show that it holds for the case of $i$. The line of reasoning is almost the same as in Claim 3. For completeness, we present it here.

Consider the following runs

$[i,p_0,\theta_0] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i,p_N,\theta_N] \vdash^*_{A,w(n_k,m)} [i,p_{N+1},\theta_{N+1}]$

and

$[i,r_0,\bar\theta_0] \vdash^*_{A,\bar{w}(n_k,m)} \cdots \vdash^*_{A,\bar{w}(n_k,m)} [i,r_M,\bar\theta_M] \vdash^*_{A,\bar{w}(n_k,m)} [i,r_{M+1},\bar\theta_{M+1}]$.

By the assumption that $[i,p_0,\theta_0]$ and $[i,r_0,\bar\theta_0]$ are compatible, we have $p_0 = r_0$. We are going to prove that $p_{N+1} = r_{M+1}$ in three stages. Let $l' = l + n_{i-1} + 2$.

Stage 1. $p_{K(l')-2} = r_{K(l')-2}$.

To prove this, we show that $p_h = r_h$, for each $h \in \{0, \ldots, K(l')\}$. The proof is by induction on $h$. The base case $p_0 = r_0$ follows from the fact that $[i,p_0,\theta_0]$ and $[i,r_0,\bar\theta_0]$ are compatible.

For the induction step, suppose that $p_h = r_h$. By the normalisation of the automaton $A$, the runs are of the form:

$[i,p_h,\theta_h] \vdash_{A,w(n_k,m)} [i-1,p'_0,\theta'_0] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i-1,p'_{N+1},\theta'_{N+1}]$

and

$[i,r_h,\bar\theta_h] \vdash_{A,\bar{w}(n_k,m)} [i-1,r'_0,\bar\theta'_0] \vdash^*_{A,\bar{w}(n_k,m)} \cdots \vdash^*_{A,\bar{w}(n_k,m)} [i-1,r'_{M+1},\bar\theta'_{M+1}]$,

where $\theta'_h(i-1) = h$ for each $h \in \{1, \ldots, N+1\}$ and $\bar\theta'_h(i-1) = h$ for each $h \in \{1, \ldots, M+1\}$.

By determinism of $A$, we have $p'_0 = r'_0$. Then, by Claim 1, since $0 \le h \le K(l')$, we have $[i-1,p'_0,\theta'_0]$ and $[i-1,r'_0,\bar\theta'_0]$ compatible with respect to $l'$. By the induction hypothesis of Claim 4, we have $p'_{N+1} = r'_{M+1}$. Then, by determinism of $A$, we have $p_{h+1} = r_{h+1}$.

Stage 2. $p_{K(l')-2} = p_{K(l')-2m-2} = r_{K(l')-2m-2}$.

In Stage 1 we already have $p_{K(l')-2m-2} = r_{K(l')-2m-2}$. Claim 2 implies that $p_{K(l')-2} = p_{K(l')-2m-2}$.

Stage 3. $p_{N+1} = r_{M+1}$.

By Stage 2, we have $p_{K(l')-2} = r_{K(l')-2m-2}$. We are going to prove that $p_h = r_{h-2m}$, for each $h \in \{K(l') - 2, \ldots, N + 1\}$.

The proof is by induction on $h$. The base case, $h = K(l') - 2$, follows from Stage 2.

For the induction step, suppose that $p_h = r_{h-2m}$. By the normalisation of the automaton $A$, we assume that the runs are of the form:

$[i,p_h,\theta_h] \vdash_{A,w(n_k,m)} [i-1,p'_0,\theta'_0] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i-1,p'_{N+1},\theta'_{N+1}]$

and

$[i,r_{h-2m},\bar\theta_{h-2m}] \vdash_{A,\bar{w}(n_k,m)} [i-1,r'_0,\bar\theta'_0] \vdash^*_{A,\bar{w}(n_k,m)} \cdots \vdash^*_{A,\bar{w}(n_k,m)} [i-1,r'_{M+1},\bar\theta'_{M+1}]$,

where $\theta'_h(i-1) = h$ for each $h \in \{1, \ldots, N+1\}$ and $\bar\theta'_h(i-1) = h$ for each $h \in \{1, \ldots, M+1\}$.

That we have $[i,p_h,\theta_h] \vdash_{A,w(n_k,m)} [i-1,p'_0,\theta'_0]$ and $[i,r_{h-2m},\bar\theta_{h-2m}] \vdash_{A,\bar{w}(n_k,m)} [i-1,r'_0,\bar\theta'_0]$ is due to the normalisation of the automaton $A$ described in the beginning of Subsection 3.2.

By determinism of $A$, we have $p'_0 = r'_0$. Then, by Claim 1, since $h \ge K(l')$, we have $[i-1,p'_0,\theta'_0]$ and $[i-1,r'_0,\bar\theta'_0]$ compatible with respect to $l$. By the induction hypothesis of Claim 4, we have $p'_{N+1} = r'_{M+1}$. Then, by determinism of $A$, we have $p_{h+1} = r_{h+1-2m}$.

This completes the proof of Claim 4. □

Proof (of Proposition 3.10). We simply apply Claim 4, in which $i = k$, and both $p_0, r_0$ are the initial state $q_0$ of $A$. Note that the initial configurations of $A$ on $w(n_k,m)$ and $\bar{w}(n_k,m)$ are the same; thus, they are compatible. □

3.3 Proof of Claim 2 

In this subsection we are going to prove Claim 2. The proof is also rather long and 
technical. We need the following definition. 

Definition 3.12. In the following, let $i \in \{1, \ldots, k\}$.

(1) An assignment $\theta : \{i, \ldots, k\} \to \{0, 1, \ldots, K(n_k) + 1\}$ of pebbles $i, i+1, \ldots, k$ on $w(n_k,m)$ is called a pebble-$i$ assignment.

(2) For two pebble-$i$ assignments $\theta_1$ and $\theta_2$, we say that they have the same pebble ordering, if for each $j, j' \in \{i, i+1, \ldots, k\}$, $\theta_1(j) \le \theta_1(j')$ if and only if $\theta_2(j) \le \theta_2(j')$.
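The second part of the definition is a pairwise comparison and can be checked directly. The sketch below assumes, for illustration only, that assignments are dictionaries from pebble indices to positions; the function name is hypothetical.

```python
from typing import Dict


def same_pebble_ordering(theta1: Dict[int, int], theta2: Dict[int, int]) -> bool:
    """True iff theta1(j) <= theta1(j') exactly when theta2(j) <= theta2(j')."""
    if theta1.keys() != theta2.keys():
        raise ValueError("both assignments must place the same pebbles")
    pebbles = list(theta1)
    return all(
        (theta1[j] <= theta1[jp]) == (theta2[j] <= theta2[jp])
        for j in pebbles
        for jp in pebbles
    )
```

Because the comparison ranges over all ordered pairs, two pebbles share a position in one assignment exactly when they share a position in the other.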

In this subsection we are going to prove Claim 2 together with Claim 5 below. In 
fact, we are going to prove both claims simultaneously. (We will give the structure 
of the proofs later on.) 

Claim 5. Let $[i,q,\theta_1]$ and $[i,q,\theta_2]$ be configurations of $A$ on $w(n_k,m)$ such that

(1) $\theta_1$ and $\theta_2$ have the same pebble ordering;

(2) for each $j \in \{i, \ldots, k\}$, $\theta_1(j) \le \theta_2(j)$;

(3) there exist integers $l_1, l_2, l_3, l_4$ and $\pi$ such that $l_1 \le l_2 \le l_3 \le l_4$ and $1 \le \pi \le |Q|$, and for each $j \in \{i, \ldots, k\}$,

(a) if $\theta_1(j) \le K(l_1)$ or $\theta_1(j) \ge L(l_4) + 1$, then $\theta_1(j) = \theta_2(j)$;

(b) if $\theta_2(j) \le K(l_1)$ or $\theta_2(j) \ge L(l_4) + 1$, then $\theta_1(j) = \theta_2(j)$;

(c) $l_2 - l_1 \ge n_{i-1} + 1$;

(d) $l_4 - l_3 \ge n_{i-1} + 1$;

(e) $Image(\theta_1) \cap (\{K(l_1) + 1, \ldots, K(l_2)\} \cup \{L(l_3) + 1, \ldots, L(l_4)\}) = \emptyset$;

(f) $Image(\theta_2) \cap (\{K(l_1) + 1, \ldots, K(l_2)\} \cup \{L(l_3) + 1, \ldots, L(l_4)\}) = \emptyset$;

(g) if $\theta_1(j) \in \{K(l_2) + 1, \ldots, L(l_3)\}$, then $\theta_2(j) \in \{K(l_2) + 1, \ldots, L(l_3)\}$ and $\theta_2(j) - \theta_1(j) = \pi\beta_{i-1}!$;

(h) if $\theta_2(j) \in \{K(l_2) + 1, \ldots, L(l_3)\}$, then $\theta_1(j) \in \{K(l_2) + 1, \ldots, L(l_3)\}$ and $\theta_2(j) - \theta_1(j) = \pi\beta_{i-1}!$.

If $[i,q,\theta_1] \vdash^* [i,p,Succ_i(\theta_1)]$ and $[i,q,\theta_2] \vdash^* [i,r,Succ_i(\theta_2)]$, then $p = r$.
Below we give an intuitive meaning of Claim 5. Consider the following illustration, where $\theta_1$ and $\theta_2$ are configurations on $w(n_k,m)$ with the same pebble ordering.

[Illustration (lost in extraction): the word $w(n_k,m)$ drawn twice, once for each configuration, with the positions $K(l_1)$, $K(l_2)$, $L(l_3)$ and $L(l_4)$ marked. Region (♭) lies up to $K(l_1)$, region (♮) lies between $K(l_2)$ and $L(l_3)$, and region (♯) lies beyond $L(l_4)$; the two stretches in between are marked "No pebble here".]

The meanings of $l_1, l_2, l_3, l_4$ and $\pi$ are such that for each $j \in \{i, \ldots, k\}$,

— if pebble $j$ is found in region (♭) in both configurations $\theta_1$ and $\theta_2$, then $\theta_1(j) = \theta_2(j)$;

— if pebble $j$ is found in region (♮) in both configurations $\theta_1$ and $\theta_2$, then $\theta_2(j) - \theta_1(j) = \pi\beta_{i-1}!$;

— if pebble $j$ is found in region (♯) in both configurations $\theta_1$ and $\theta_2$, then $\theta_2(j) = \theta_1(j)$.

In both configurations $\theta_1$ and $\theta_2$ no pebbles are found in the region between $K(l_1) + 1$ and $K(l_2)$, nor in the region between $L(l_3) + 1$ and $L(l_4)$. Claim 5 states that both configurations $[i,q,\theta_1]$ and $[i,q,\theta_2]$ are essentially the "same", in the sense that if $[i,q,\theta_1] \vdash^* [i,p,Succ_i(\theta_1)]$ and $[i,q,\theta_2] \vdash^* [i,r,Succ_i(\theta_2)]$, then $p = r$.

The proofs of both Claims 2 and 5 use a rather involved inductive argument. In 
fact, we are going to prove both claims simultaneously by induction. The induction 
step on the proof of each claim uses the induction hypothesis of both claims. The 
overall structure of the proofs of both Claims 2 and 5 is as follows. 

(1) We prove the base case i = 1 of Claim 2. 

(2) We prove the base case i = 1 of Claim 5. 

(3) For the induction hypothesis, we assume that both Claims 2 and 5 hold for the 
case i. 

(4) For the induction step, we prove Claim 2 for the case $i + 1$.

This step uses the hypothesis that both Claims 2 and 5 hold for case $i$.

(5) For the other induction step, we prove Claim 5 for the case $i + 1$.

As in Step 4, this step uses the hypothesis that both Claims 2 and 5 hold for case $i$.

Proof of the base case $i = 1$ for Claim 2. Let $l$ be an integer such that for each $j \in \{2, \ldots, k\}$, either $\theta(j) \le K(l)$ or $\theta(j) > L(l + 2)$, where the number 2 comes from $n_1 = 2$.

The symbols in $C_{l+1}b_lb_{l+1}D_{l+1}$ are different from all the symbols seen by pebbles $2, \ldots, k$. We are going to show that when reading $C_{l+1}b_lb_{l+1}D_{l+1}$, pebble 1 enters into a loop of states. See the illustration below.

[Illustration (lost in extraction): the word $w(n_k,m) = a_0a_1 \cdots a_{l-1}a_l\, C_{l+1}\, b_lb_{l+1}\, D_{l+1}\, a_{l+1}a_{l+2}\, a_{l+2}a_{l+3} \cdots$ with the prefixes of length $K(l)$ and $L(l + 2)$ marked. With pebble 1 reading $C_{l+1}b_lb_{l+1}D_{l+1}$, the states of $A$ become periodic.]

On reading the segment $C_{l+1}b_lb_{l+1}D_{l+1}$, the transitions used are of the form $(1, \emptyset, \emptyset, s) \to (s', \mathtt{right})$. Due to the determinism of the automaton $A$, there exist integers $\nu_0$ and $\nu$ such that $\nu_0, \nu \le |Q|$ and for each $h$ where $\nu_0 \le h \le K(l + n_{i-1} + 2) - \nu$, we have $p_h = p_{h+\nu}$. In particular, since $\beta_2 = |Q|! \times \beta_1!$, we have that $\nu$ divides $\beta_2$. Furthermore, $\beta_2$ also divides $m = \beta_{k+1}$; thus $\nu$ divides $m$, and therefore

$p_{K(l+n_{i-1}+2)-2-2m} = p_{K(l+n_{i-1}+2)-2}$.
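The existence of $\nu_0$ and $\nu$ in this base case is a pigeonhole argument: a deterministic state sequence over $|Q|$ states must revisit a state within $|Q|$ steps. The sketch below illustrates this under a simplifying assumption, namely that on the segment in question the next state depends only on the current state, which is modelled by a hypothetical function `step`; offsets are counted from the start of the segment.

```python
from typing import Callable, Hashable, Tuple


def find_period(
    step: Callable[[Hashable], Hashable], start: Hashable, n_states: int
) -> Tuple[int, int]:
    """Return (nu0, nu) with nu >= 1 and state[nu0] == state[nu0 + nu].

    Both nu0 and nu are at most n_states when step maps into a set
    of n_states states.
    """
    first_seen = {}
    state = start
    for t in range(n_states + 1):
        if state in first_seen:
            # First repeated state: its earlier index is the preperiod.
            return first_seen[state], t - first_seen[state]
        first_seen[state] = t
        state = step(state)
    raise AssertionError("step does not map into a set of n_states states")
```

From offset `nu0` onwards the sequence repeats with period `nu`, which is the periodicity $p_h = p_{h+\nu}$ used in the proof.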

Proof of the base case $i = 1$ for Claim 5. Suppose $[1,q,\theta_1]$ and $[1,q,\theta_2]$ are configurations of $A$ on $w(n_k,m)$ and $l_1, l_2, l_3, l_4, \pi$ are integers such that the conditions (1), (2), (3.a)-(3.h) above hold. Moreover, suppose also that

$[1,q,\theta_1] \vdash [1,p,Succ_1(\theta_1)]$ and $[1,q,\theta_2] \vdash [1,r,Succ_1(\theta_2)]$.

We are going to show that $p = r$.

By conditions (3.e) and (3.f), there can only be three cases: $\theta_1(1) \le K(l_1)$, $K(l_2) + 1 \le \theta_1(1) \le L(l_3)$, and $\theta_1(1) \ge L(l_4) + 1$.

Case 1. $\theta_1(1) \le K(l_1)$.

By condition (3.a), we have $\theta_1(1) = \theta_2(1)$. By conditions (1), (2), (3.a) and (3.b), for any $j \in \{2, \ldots, k\}$, we have

$\theta_1(j) = \theta_1(1)$ if and only if $\theta_2(j) = \theta_2(1)$.   (4)

By condition (3.c), $l_2 - l_1 \ge 1$. Moreover, no symbol in $C_{l_2}b_{l_2-1}b_{l_2}D_{l_2} \cdots a_{n_k-1}a_{n_k}$ appears in $a_0a_1 \cdots a_{l_1-1}a_{l_1}$, and by conditions (3.e) and (3.f), no pebbles are placed on $C_{l_1} \cdots a_{l_2-1}a_{l_2}$. Therefore, for any $j \in \{2, \ldots, k\}$,

pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_1]$ if and only if pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_2]$.   (5)

Thus, by (4) and (5), the same transition applies to both $[1,q,\theta_1]$ and $[1,q,\theta_2]$. Since $A$ is deterministic, we have $p = r$.

Case 2. $K(l_2) + 1 \le \theta_1(1) \le L(l_3)$.

That is, $\theta_2(1) = \theta_1(1) + \pi\beta_{i-1}!$, where $1 \le \pi\beta_{i-1}! \le m$. By the same conditions (3.g) and (3.h), for any $j \in \{2, \ldots, k\}$,

$\theta_1(j) = \theta_1(1)$ if and only if $\theta_2(j) = \theta_2(1)$.   (6)

By condition (3.c), $l_2 - l_1 \ge 1$. Moreover, no symbol in $C_{l_2}b_{l_2-1}b_{l_2}D_{l_2} \cdots a_{n_k-1}a_{n_k}$ appears in $a_0a_1 \cdots a_{l_1-1}a_{l_1}$. Therefore, for any $j \in \{2, \ldots, k\}$, if pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_1]$, then $K(l_2) + 1 \le \theta_1(j) \le L(l_3)$; and similarly, if pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_2]$, then $K(l_2) + 1 \le \theta_2(j) \le L(l_3)$. By conditions (3.g) and (3.h), $\theta_2(j) = \theta_1(j) + \pi\beta_{i-1}!$. Due to the definition of $w(n_k,m)$, we have

pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_1]$ if and only if pebbles $j$ and 1 read the same symbol in the configuration $[1,q,\theta_2]$.   (7)

Thus, by (6) and (7), the same transition applies to both $[1,q,\theta_1]$ and $[1,q,\theta_2]$. Since $A$ is deterministic, we have $p = r$.

Case 3. $\theta_1(1) \ge L(l_4) + 1$.

The proof is similar to the one for Case 1 above, thus omitted.

This completes the proof of the base case $i = 1$ for Claim 5.

The induction hypothesis. Both Claims 2 and 5 hold for case i. 

The induction step for Claim 2. We are going to show that Claim 2 holds for the case $i + 1$.

Suppose we have the following run:

$[i+1,p_0,\theta_0] \vdash^*_{A,w(n_k,m)} [i+1,p_1,\theta_1] \vdash^*_{A,w(n_k,m)} \cdots \vdash^*_{A,w(n_k,m)} [i+1,p_{N+1},\theta_{N+1}]$.

Let $l$ be the integer as stated in Claim 2. Since $m > |Q|\beta_i!$, there exists a pair $(\eta, \eta')$ of indexes such that

— $K(l + n_i + 1) + 1 \le \eta < \eta' \le K(l + n_i + 2) - 2$;
— $\eta' - \eta = \pi\beta_i!$, where $1 \le \pi \le |Q|$;
— $p_\eta = p_{\eta'}$.

We pick such a pair $(\eta, \eta')$ in which $\eta$ is the smallest. We claim that $\nu_0 = \eta$ and $\nu = \eta' - \eta$ are the desired two integers in Claim 2.

We are going to show that for each $h \in \{\nu_0, \ldots, K(l + n_i + 2) - 2 - \nu\}$,

if $p_h = p_{h+\nu}$, then $p_{h+1} = p_{h+\nu+1}$.   (8)

Since by definition of $\nu_0$ and $\nu$, we already have $p_{\nu_0} = p_{\nu_0+\nu}$, this immediately implies that for each $h \in \{\nu_0, \ldots, K(l + n_i + 2) - 2 - \nu\}$, $p_h = p_{h+\nu}$.

To prove (8), suppose $p_h = p_{h+\nu}$. Consider the following run:

— $[i+1,p_h,\theta_h] \vdash [i,s_0,\theta_h \cup \{(i,0)\}]$;
— $[i,s_0,\theta_h \cup \{(i,0)\}] \vdash^* \cdots \vdash^* [i,s_{N+1},\theta_h \cup \{(i,N+1)\}]$;
— $[i,s_{N+1},\theta_h \cup \{(i,N+1)\}] \vdash [i+1,s',\theta_h] \vdash [i+1,p_{h+1},\theta_{h+1}]$;

and the following run:

— $[i+1,p_{h+\nu},\theta_{h+\nu}] \vdash [i,t_0,\theta_{h+\nu} \cup \{(i,0)\}]$;
— $[i,t_0,\theta_{h+\nu} \cup \{(i,0)\}] \vdash^* \cdots \vdash^* [i,t_{N+1},\theta_{h+\nu} \cup \{(i,N+1)\}]$;
— $[i,t_{N+1},\theta_{h+\nu} \cup \{(i,N+1)\}] \vdash [i+1,t',\theta_{h+\nu}] \vdash [i+1,p_{h+\nu+1},\theta_{h+\nu+1}]$.

Since $p_h = p_{h+\nu}$ and $A$ is deterministic, we have $s_0 = t_0$. Our aim is to prove that $s_{N+1} = t_{N+1}$. To this end, there are a few steps.
Step 1 (Application of the hypothesis that Claim 5 holds for the case $i$). For each $j \in \{0, \ldots, K(l + n_{i-1} + 2)\}$, we claim that $s_j = t_j$.

To apply the induction hypothesis that Claim 5 holds for the case $i$, we take the integers

$l_1 = l + n_{i-1} + 2$
$l_2 = l + n_i + 1$
$l_3 = l_2$
$l_4 = l + n_{i+1}$

Recall that $l$ is the integer such that every pebble, except pebbles $i$ and $(i + 1)$, is located either $\le K(l)$, or $> L(l + n_{i+1})$. Recall also that $\nu = \pi\beta_i!$.

It is straightforward to show that $l_2 - l_1 \ge n_{i-1} + 1$ and $l_4 - l_3 \ge n_{i-1} + 1$, and all the conditions (1), (2) and (3.a)-(3.h) hold. Since $s_0 = t_0$, applying the hypothesis that Claim 5 holds for the case $i$, for each $j \in \{0, \ldots, K(l + n_{i-1} + 2)\}$ we have $s_j = t_j$.

Step 2 (Application of the hypothesis that Claim 2 holds for the case $i$). For each $j \in \{K(l + n_{i-1} + 1) + 1, \ldots, K(l + n_{i-1} + 2) - 2\}$, in the configuration $[i,s_j,\theta_h \cup \{(i,j)\}]$ the integer $l$ satisfies the condition that each of the pebbles $i+1, \ldots, k$ is located either $\le K(l)$, or $> L(l + n_i)$.

Applying the induction hypothesis that Claim 2 holds for the case $i$, there exist two integers $\nu'_0$ and $\nu'$ such that

— $K(l + n_{i-1} + 1) + 1 \le \nu'_0 \le K(l + n_{i-1} + 1) + \beta_i$;
— $1 \le \nu' \le \beta_i$;
— $s_j = s_{j+\nu'}$, for each $j \in \{K(l + n_{i-1} + 1) + 1, \ldots, K(l + n_{i-1} + 2) - \nu' - 2\}$.

In particular, $\nu'$ divides $\beta_{i+1}$, by definition of $\beta_{i+1}$; thus, $s_j = s_{j+\nu}$, for each $j \in \{K(l + n_{i-1} + 1) + 1, \ldots, K(l + n_{i-1} + 2) - \nu - 2\}$.

Similarly, we can show that $t_j = t_{j+\nu}$, for each $j \in \{K(l + n_{i-1} + 1) + 1, \ldots, K(l + n_{i-1} + 2) - \nu - 2\}$.

Step 3 (Application of the hypothesis that Claim 5 holds for the case $i$). For each $j \in \{K(l + n_{i-1} + 1) + \nu_0, \ldots, L(n_i + 2 + n_{i-1} + 1)\}$, we claim that $s_j = t_{j+\nu}$.

To apply the induction hypothesis that Claim 5 holds for the case $i$, we take the following integers.

$l_1 = l$
$l_2 = l + n_{i-1} + 1$
$l_3 = l + n_i + 2 + n_{i-1} + 1$
$l_4 = l + n_{i+1}$

It is straightforward to show that $l_2 - l_1 \ge n_{i-1} + 1$ and $l_4 - l_3 \ge n_{i-1} + 1$, and all the conditions (1), (2) and (3.a)-(3.h) hold.

From Steps 1 and 2, we already have

$s_{K(l+n_{i-1}+1)+\nu_0} = t_{K(l+n_{i-1}+1)+\nu_0} = t_{K(l+n_{i-1}+1)+\nu_0+\nu}$,

$s_{K(l+n_{i-1}+1)+\nu_0} = s_{K(l+n_{i-1}+1)+\nu_0+\nu}$.

Applying the hypothesis that Claim 5 holds for the case $i$, for each $j \in \{K(l + n_{i-1} + 1) + \nu_0, \ldots, L(n_i + 2 + n_{i-1} + 1) - \nu\}$, on the configurations $[i,s_j,\theta_h \cup \{(i,j)\}]$ and $[i,t_{j+\nu},\theta_{h+\nu} \cup \{(i,j+\nu)\}]$, we have $s_j = t_{j+\nu}$.

Step 4 (Application of the hypothesis that Claim 2 holds for the case $i$). For each $j \in \{K(l + n_i + 2 + n_{i-1} + 1) + 1, \ldots, K(l + n_i + 2 + n_{i-1} + 2) - 2\}$, in the configuration $[i,s_j,\theta_h \cup \{(i,j)\}]$ the integer $l + n_i + 2$ satisfies the condition that each of the pebbles $i+1, \ldots, k$ is located either $\le K(l + n_i + 2)$, or $> L(l + n_i + 2 + n_i) = L(l + n_{i+1})$.

Applying the induction hypothesis that Claim 2 holds for the case $i$, there exist two integers $\nu''_0$ and $\nu''$ such that

— $K(l + n_i + 2 + n_{i-1} + 1) + 1 \le \nu''_0 \le K(l + n_i + 2 + n_{i-1} + 1) + \beta_i$;
— $1 \le \nu'' \le \beta_i$;
— $s_j = s_{j+\nu''}$, for each $j \in \{K(l + n_i + 2 + n_{i-1} + 1) + 1, \ldots, K(l + n_i + 2 + n_{i-1} + 2) - \nu'' - 2\}$.

In particular, $\nu''$ divides $\beta_{i+1}$, by definition of $\beta_{i+1}$; thus, $s_j = s_{j+\nu}$, for each $j \in \{K(l + n_i + 2 + n_{i-1} + 1) + 1, \ldots, K(l + n_i + 2 + n_{i-1} + 2) - \nu - 2\}$.

Similarly, we can show that $t_j = t_{j+\nu}$, for each $j \in \{K(l + n_i + 2 + n_{i-1} + 1) + 1, \ldots, K(l + n_i + 2 + n_{i-1} + 2) - \nu - 2\}$. In particular, we have

$s_{K(l+n_i+2+n_{i-1}+2)-2} = t_{K(l+n_i+2+n_{i-1}+2)-2}$.

By definition of $L(\cdot)$ and $K(\cdot)$, this is equivalent to stating that

$s_{L(l+n_i+2+n_{i-1}+1)} = t_{L(l+n_i+2+n_{i-1}+1)}$.

Step 5 (Application of the hypothesis that Claim 5 holds for the case $i$). For each
$j \in \{L(l + n_{i-1} + 1), \ldots, N + 1\}$, we claim that $s_j = t_j$.

To apply the induction hypothesis that Claim 5 holds for the case $i$, we take the integers

$l_1 = l$

$l_2 = l + n_{i-1} + 1$
$l_3 = l_2$

$l_4 = l_3 + n_{i-1} + 1$

It is straightforward to show that $l_2 - l_1 \ge n_{i-1} + 1$ and $l_4 - l_3 \ge n_{i-1} + 1$, and all
the conditions (1), (2) and (3.a)-(3.h) hold.

By Step 4, we already have $s_{L(l + n_i + 2 + n_{i-1} + 1)} = t_{L(l + n_i + 2 + n_{i-1} + 1)}$. Applying the
hypothesis that Claim 5 holds for the case $i$, for each $j \in \{L(l + n_{i-1} + 1), \ldots, N + 1\}$,
on the configurations $[i, s_j, \theta_h \cup \{(i, j)\}]$ and $[i, t_j, \theta_h \cup \{(i, j)\}]$, we have
$s_j = t_j$.

From here, as $s_{N+1} = t_{N+1}$ and $\mathcal{A}$ is deterministic, we have $s' = t'$. And again, by
the determinism of $\mathcal{A}$, this implies $p_{h+1} = p_{h+1+\nu}$. This completes the induction
step for Claim 2.

The induction step for Claim 5. We are going to show that Claim 5 holds
for the case $i + 1$.

Suppose $[i + 1, q, \theta_1]$ and $[i + 1, q, \theta_2]$ are configurations of $\mathcal{A}$ on $w(n_k, m)$ such
that the conditions (1), (2), (3.a)-(3.g) above hold.
Consider the following run:

— $[i + 1, q, \theta_1] \vdash [i, s_0, \theta_1 \cup \{(i, 0)\}]$;
— $[i, s_0, \theta_1 \cup \{(i, 0)\}] \vdash^{*} \cdots \vdash^{*} [i, s_{N+1}, \theta_1 \cup \{(i, N + 1)\}]$;
— $[i, s_{N+1}, \theta_1 \cup \{(i, N + 1)\}] \vdash [i + 1, s', \theta_1] \vdash [i + 1, p, Succ_{i+1}(\theta_1)]$;

and the following run:

— $[i + 1, q, \theta_2] \vdash [i, t_0, \theta_2 \cup \{(i, 0)\}]$;
— $[i, t_0, \theta_2 \cup \{(i, 0)\}] \vdash^{*} \cdots \vdash^{*} [i, t_{N+1}, \theta_2 \cup \{(i, N + 1)\}]$;
— $[i, t_{N+1}, \theta_2 \cup \{(i, N + 1)\}] \vdash [i + 1, t', \theta_2] \vdash [i + 1, r, Succ_{i+1}(\theta_2)]$.

We are going to show that $p = r$. The proof is similar to the proof of the induction
step of Claim 2, so the details are omitted.

Briefly, the proof is divided into the same Steps 1-5 above. The reasoning in
each step still applies in this induction step, and at the end we obtain $s_{N+1} = t_{N+1}$;
thus, $s' = t'$ and $p = r$.

4. WEAK PA 

There is an analogue of our results from the previous section for another, weaker,
version of pebble automata. In the model defined in Section 2, the new pebble is
placed at the beginning of the input word. This model is called strong PA in [Neven
et al. 2004]. An alternative would be to place the new pebble at the position of
the most recent one. The model defined this way is usually referred to as weak PA.
Formally, it is defined by setting $\theta'(i - 1) = \theta(i)$ (and keeping $\theta'(i) = \theta(i)$) in the case
of act = place-pebble in the definition of the transition relation in Definition 2.1.
We give the formal definition below.

Definition 4.1. A two-way alternating weak $k$-pebble automaton (in short, weak
$k$-PA) is a system $\mathcal{A} = (Q, q_0, F, \mu, U)$ whose components are defined as follows.

(1) $Q$, $q_0 \in Q$ and $F \subseteq Q$ are a finite set of states, the initial state, and the set of
final states, respectively;

(2) $U \subseteq Q - F$ is the set of universal states; and

(3) $\mu$ is a finite set of transitions of the form $\alpha \to \beta$ such that

— $\alpha$ is of the form $(i, P, V, q)$, where $i \in \{1, \ldots, k\}$, $P, V \subseteq \{i + 1, \ldots, k\}$, $q \in Q$
and

— $\beta$ is of the form $(q, \textsf{act})$, where $q \in Q$ and

$\textsf{act} \in \{\textsf{right}, \textsf{place-pebble}, \textsf{lift-pebble}\}$.

The definitions of pebble assignment, configurations, initial and final configurations,
as well as application of a transition on configurations, are the same as defined
in the case of strong PA in Subsection 2.1.

We define the transition relation $\vdash_{\mathcal{A},w}$ on $\triangleleft w \triangleright$ as follows: $[i, q, \theta] \vdash_{\mathcal{A},w} [i', q', \theta']$, if
there is a transition $\alpha \to (p, \textsf{act}) \in \mu$ that applies to $[i, q, \theta]$ such that $q' = p$, for
all $j > i$, $\theta'(j) = \theta(j)$, and

— if act = right, then $i' = i$ and $\theta'(i) = \theta(i) + 1$,
— if act = lift-pebble, then $i' = i + 1$,
— if act = place-pebble, then $i' = i - 1$, $\theta'(i - 1) = \theta(i)$ and $\theta'(i) = \theta(i)$.

Note the difference in the definition of $\theta'$ for the case of act = place-pebble from
the one in the case of strong PA in Subsection 2.1.
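The effect of the three actions on a pebble assignment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `step` and the dictionary encoding of $\theta$ are hypothetical, the assignment is taken to be defined only on the pebbles currently placed, and for strong PA the new pebble is put on position 0, following the description that it is placed at the beginning of the input word.

```python
def step(i, theta, act, weak=True):
    """Apply one action to the head pebble i and return (i', theta').

    theta maps each placed pebble index j (with j >= i) to its position.
    The encoding is an assumption made for illustration only.
    """
    new_theta = dict(theta)  # positions of pebbles j > i are never changed
    if act == "right":
        new_theta[i] = theta[i] + 1
        return i, new_theta
    if act == "lift-pebble":
        del new_theta[i]  # pebble i is removed; pebble i + 1 becomes the head
        return i + 1, new_theta
    if act == "place-pebble":
        # weak PA: pebble i - 1 starts where pebble i is;
        # strong PA: pebble i - 1 starts at position 0
        new_theta[i - 1] = theta[i] if weak else 0
        return i - 1, new_theta
    raise ValueError("unknown action: " + act)
```

The only place where the two models differ is the single line handling place-pebble; all other actions are shared.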

Theorem 4.2. [Tan 2010, Theorem 3] For each $k \ge 1$, one-way alternating,
nondeterministic and deterministic weak $k$-PA have the same recognition power.

However, weak $k$-PA is weaker than strong $k$-PA. For example, $\mathcal{R}_{2^k - 1}$ is not a
weak $k$-PA language, see Lemma 4.3 below.
Let

$\mathrm{wPA}_k = \{L \mid L \text{ is accepted by a weak } k\text{-PA}\}$

and

$\mathrm{wPA} = \bigcup_{k \ge 1} \mathrm{wPA}_k$.

The following lemma is the weak PA version of Proposition 3.1 and Corollary 3.3.

Lemma 4.3. For each $k = 1, 2, \ldots$, $\mathcal{R}^{+}_{k} \in \mathrm{wPA}_k$, but $\mathcal{R}^{+}_{k+1} \notin \mathrm{wPA}_k$.

PROOF. First, we prove that $\mathcal{R}^{+}_{k} \in \mathrm{wPA}_k$. On an input word $w = a_0 b_0 \cdots a_n b_n$, the
weak $k$-PA $\mathcal{A}_k$ that accepts $\mathcal{R}^{+}_{k}$ works as follows.

(1) It places pebble $k$ on the second position to read the symbol $b_0$.

(2) For each $i = k - 1, \ldots, 1$, it does the following.

(a) It places pebble $i$, and nondeterministically moves it right until it finds an
odd position that contains the same symbol read by pebble $i + 1$.

(b) If it finds such a position, it moves pebble $i$ one step to the right.

(c) If it cannot find such a position, it rejects the input word.

(3) If at the end, pebble 1 is on the last position, then the automaton accepts the
input word.

It is quite straightforward to show that the automaton $\mathcal{A}_k$ accepts $\mathcal{R}^{+}_{k}$.
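The procedure in steps (1)-(3) can be simulated directly. The following Python sketch is only an illustration: the function name `accepts_forward_path` is hypothetical, positions are taken to be 1-based (so the symbols $a_j$ sit on odd positions), and the nondeterministic choice in step (2.a) is resolved by trying every candidate position in turn.

```python
def accepts_forward_path(word, k):
    """Simulate steps (1)-(3) on `word` (a list of symbols) with k pebbles."""
    n = len(word)
    if k < 1 or n < 2 or n % 2 != 0:
        return False  # assumption: inputs are non-empty words of even length

    def search(pos, remaining):
        # pos: 1-based (even) position of the most recently placed pebble
        # remaining: number of pebbles that still have to be placed
        if remaining == 0:
            return pos == n  # step (3): the last pebble must be on the last position
        symbol = word[pos - 1]
        # step (2.a): the new pebble starts at `pos` and moves right;
        # pos is even, so pos + 1, pos + 3, ... are exactly the odd positions
        for odd in range(pos + 1, n, 2):
            if word[odd - 1] == symbol:
                # step (2.b): move one step to the right of the match
                if search(odd + 1, remaining - 1):
                    return True
        return False  # step (2.c): no suitable position was found

    return search(2, k - 1)  # step (1): pebble k sits on the second position
```

For instance, with two pebbles the word `a b x y b c` is accepted, because pebble 1 can skip the pair `x y` and stop on the second `b` before moving to the final `c`.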

Now we prove that $\mathcal{R}^{+}_{k+1} \notin \mathrm{wPA}_k$. Suppose to the contrary that there is a weak
$k$-PA $\mathcal{A}$ that accepts $\mathcal{R}^{+}_{k+1}$. By adding some extra states, we can normalise the
behaviour of each pebble as follows. For each $i \in \{1, \ldots, k\}$, pebble $i$ behaves as
follows.

— After pebble $i$ moves right, pebble $(i - 1)$ (when $i > 1$) is immediately placed
(in position reading the left-end marker $\triangleleft$).
— If $i < k$, pebble $i$ is lifted only when it reaches the right-end marker $\triangleright$ of the
input.
— Immediately after pebble $i$ is lifted, pebble $(i + 1)$ moves right.

We also assume that in the automaton $\mathcal{A}$ only pebble $k$ can enter a final state and
it may do so only after it reads the right-end marker $\triangleright$ of the input.

We let $m = \beta_{k+1}$, as defined in Subsection 3.2, where $\beta_0 = 1$, $\beta_1 = |Q|$, and for
$i \ge 2$,

Pi = \Q\\ x 

ACM Transactions on Computational Logic, Vol. V, No. N, April 2012. 



Graph Reachability and Pebble Automata over Infinite Alphabets • 29 



Also recall that the words $w(k + 1, m)$ and $\bar{w}(k + 1, m)$ are defined as follows.

$w(k + 1, m) = a_0 a_1 C_1 b_0 b_1 D_1 \cdots a_{k-1} a_k C_k b_{k-1} b_k D_k a_k a_{k+1}$

$\bar{w}(k + 1, m) = a_0 a_1 C_1 b_0 b_1 D_1 \cdots a_{k-1} a_k C_k b_{k-1} b_k$,

where for each $i = 1, \ldots, k$,

^1=^1,1^1+1,1 '•• aYm-l^i+Lm-l- 

Obviously $w(k + 1, m) \in \mathcal{R}^{+}_{k+1}$, while $\bar{w}(k + 1, m) \notin \mathcal{R}^{+}_{k+1}$. We establish the
following claim that immediately implies $\mathcal{R}^{+}_{k+1} \notin \mathrm{wPA}_k$.

Claim 6. The automaton $\mathcal{A}$ either accepts both $w(k + 1, m)$ and $\bar{w}(k + 1, m)$, or
rejects both $w(k + 1, m)$ and $\bar{w}(k + 1, m)$.

Proof. The proof is similar to the proof of Proposition 3.10, so we simply
sketch it here. Let

$[k, p_0, \theta_0] \vdash^{*}_{\mathcal{A}, w(k+1,m)} \cdots \vdash^{*}_{\mathcal{A}, w(k+1,m)} [k, p_{N+1}, \theta_{N+1}]$

be a run of $\mathcal{A}$ on $w(k + 1, m)$, where $N$ is the length of $w(k + 1, m)$ and $\theta_j(k) = j$,
for each $j \in \{0, \ldots, N + 1\}$.
Let

$[k, r_0, \theta_0] \vdash^{*}_{\mathcal{A}, \bar{w}(k+1,m)} \cdots \vdash^{*}_{\mathcal{A}, \bar{w}(k+1,m)} [k, r_{M+1}, \theta_{M+1}]$

be a run of $\mathcal{A}$ on $\bar{w}(k + 1, m)$, where $M$ is the length of $\bar{w}(k + 1, m)$ and $\theta_j(k) = j$,
for each $j \in \{0, \ldots, M + 1\}$.

Now $p_0 = r_0$, as both are the initial state of $\mathcal{A}$. We are going to show that
$p_{N+1} = r_{M+1}$. The proof consists of three steps.

Step 1. $p_m = r_m$.

This step is similar to Claim 4 proved in Subsection 3.2. That is, suppose $[k, q, \theta]$
and $[k, q, \bar{\theta}]$ are configurations on $w(k + 1, m)$ and $\bar{w}(k + 1, m)$, respectively, and
$0 \le \theta(k) = \bar{\theta}(k) < m$. If

$[k, q, \theta] \vdash^{*}_{\mathcal{A}, w(k+1,m)} [k, p, Succ_k(\theta)]$

$[k, q, \bar{\theta}] \vdash^{*}_{\mathcal{A}, \bar{w}(k+1,m)} [k, r, Succ_k(\bar{\theta})]$

then $p = r$.$^{1}$

Step 2. $p_{2m} = r_m$.

This step is similar to Claim 2 stated in Subsection 3.2. That is, there exist two
integers $\nu_0$ and $\nu$ such that for every $h \in \{m + \nu_0, \ldots, 2m - \nu\}$, we have $p_h = p_{h+\nu}$.
The main idea is that since the integer $m$ is big enough, there exists an integer $\nu$
such that every $\nu$ steps, pebble $k$ enters the same state. The integer $m$
is defined so that it is divisible by every possible such $\nu$, which implies $p_m = p_{2m}$.
That $r_m = p_m$ is deduced from the previous step.



$^{1}$The only difference between this proof and the proof of Claim 4 is that here the induction
hypothesis is that for each $1 \le i \le k - 1$, weak $i$-PA cannot differentiate between $w(i + 1, m)$ and
$\bar{w}(i + 1, m)$; while in Claim 4 the induction hypothesis is that strong $i$-PA cannot differentiate between
$w(n_i, m)$ and $\bar{w}(n_i, m)$.
Step 3. $p_{N+1} = r_{M+1}$.
Here we make use of the fact that $\mathcal{A}$ is a weak PA. From the previous step we have $p_{2m} = r_m$.
On the configuration $[k, p_{2m}, \theta_{2m}]$ of $\mathcal{A}$ on $w(k + 1, m)$, pebble $k$ only "sees"
$a_1 a_2 C_2 b_1 b_2 D_2 \cdots a_{k-1} a_k C_k b_{k-1} b_k D_k a_k a_{k+1}$; while on the configuration $[k, r_m, \theta_m]$
of $\mathcal{A}$ on $\bar{w}(k + 1, m)$, pebble $k$ only "sees" $b_0 b_1 D_1 \cdots a_{k-1} a_k C_k b_{k-1} b_k$.
Since $a_1 a_2 C_2 b_1 b_2 D_2 \cdots a_{k-1} a_k C_k b_{k-1} b_k D_k a_k a_{k+1}$ and $b_0 b_1 D_1 \cdots a_{k-1} a_k C_k b_{k-1} b_k$
are essentially the same, we have $p_{2m+1} = r_{m+1}$. Similarly, from $p_{2m+1} = r_{m+1}$,
we can also conclude that $p_{2m+2} = r_{m+2}$ and then $p_{2m+3} = r_{m+3}$ and so on until
we get $p_{N+1} = r_{M+1}$.

□ 

This completes the proof of Lemma 4.3. □ 

Lemma 4.3 immediately implies the strict hierarchy for wPA languages.
Theorem 4.4. For each $k = 1, 2, \ldots$, $\mathrm{wPA}_k \subsetneq \mathrm{wPA}_{k+1}$.

5. LINEAR TEMPORAL LOGIC WITH ONE REGISTER FREEZE QUANTIFIER 

In this section we recall the definition of Linear Temporal Logic (LTL) augmented
with one register freeze quantifier [Demri and Lazic 2009]. We consider only one-way
temporal operators "next" $\mathtt{X}$ and "until" $\mathtt{U}$, and do not consider their past
time counterparts. Moreover, in [Demri and Lazic 2009] the LTL model is defined
over data words. Since in this paper we essentially ignore the finite labels, the LTL
model presented here also ignores the finite labels. However, the result here can be
adapted in a straightforward manner for the data word model.

Roughly, the logic $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ is the standard LTL augmented with a register to
store a symbol from the infinite alphabet. Formally, the formulas are defined as
follows.

— Both True and False belong to $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$.
— $\uparrow$ is in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$.
— If $\varphi, \psi$ are in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, then so are $\neg\varphi$, $\varphi \vee \psi$ and $\varphi \wedge \psi$.
— If $\varphi$ is in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, then so is $\mathtt{X}\varphi$.
— If $\varphi$ is in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, then so is $\downarrow\varphi$.
— If $\varphi, \psi$ are in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, then so is $\varphi \mathtt{U} \psi$.

Intuitively, the predicate $\uparrow$ is intended to mean that the current symbol is the same
as the symbol in the register, while $\downarrow\varphi$ is intended to mean that the formula $\varphi$
holds when the register contains the current symbol. This will be made precise in
the definition of the semantics of $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ below.

An occurrence of $\uparrow$ within the scope of some freeze quantification $\downarrow$ is bounded
by it; otherwise, it is free. A sentence is a formula with no free occurrence of $\uparrow$.

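Next, we define the freeze quantifier rank of a sentence $\varphi$, denoted by $\mathrm{fqr}(\varphi)$.

— $\mathrm{fqr}(\mathrm{True}) = \mathrm{fqr}(\mathrm{False}) = \mathrm{fqr}(\uparrow) = 0$.
— $\mathrm{fqr}(\mathtt{X}\varphi) = \mathrm{fqr}(\neg\varphi) = \mathrm{fqr}(\varphi)$, for every $\varphi$ in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$.
— $\mathrm{fqr}(\varphi \vee \psi) = \mathrm{fqr}(\varphi \wedge \psi) = \mathrm{fqr}(\varphi \mathtt{U} \psi) = \max(\mathrm{fqr}(\varphi), \mathrm{fqr}(\psi))$, for every $\varphi$ and $\psi$ in
$\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$.
— $\mathrm{fqr}(\downarrow\varphi) = \mathrm{fqr}(\varphi) + 1$, for every $\varphi$ in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$.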
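The rank is a plain structural recursion, which the following Python sketch makes explicit. The tuple encoding of formulas (tags such as `"up"` for $\uparrow$ and `"freeze"` for $\downarrow$) is an assumption made here for illustration and is not part of the paper.

```python
# Hypothetical encoding of formulas as nested tuples:
#   ("true",), ("false",), ("up",)             atomic formulas
#   ("not", f), ("X", f), ("freeze", f)        unary operators
#   ("or", f, g), ("and", f, g), ("U", f, g)   binary operators

def fqr(f):
    """Return the freeze quantifier rank of the formula f."""
    op = f[0]
    if op in ("true", "false", "up"):
        return 0
    if op in ("X", "not"):
        return fqr(f[1])
    if op in ("or", "and", "U"):
        return max(fqr(f[1]), fqr(f[2]))
    if op == "freeze":
        return fqr(f[1]) + 1
    raise ValueError("unknown operator: " + str(op))
```

The rank counts the maximal nesting depth of freeze quantifiers, so two freeze quantifiers side by side under a conjunction still give rank 1.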

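Finally, we define the semantics of $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$. Let $w = a_1 \cdots a_n$ be a word. For
a position $l = 1, \ldots, n$, a symbol $a$ and a formula $\varphi$ in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, $w, l \models_a \varphi$ means
that $\varphi$ is satisfied by $w$ at position $l$ when the content of the register is $a$. As usual,
$w, l \not\models_a \varphi$ means the opposite. The satisfaction relation is defined inductively as
follows.

— $w, l \models_a \mathrm{True}$ and $w, l \not\models_a \mathrm{False}$, for all $l = 1, 2, 3, \ldots$ and $a \in \mathfrak{D}$.
— $w, l \models_a \varphi \vee \psi$ if and only if $w, l \models_a \varphi$ or $w, l \models_a \psi$.
— $w, l \models_a \varphi \wedge \psi$ if and only if $w, l \models_a \varphi$ and $w, l \models_a \psi$.
— $w, l \models_a \neg\varphi$ if and only if $w, l \not\models_a \varphi$.
— $w, l \models_a \mathtt{X}\varphi$ if and only if $1 \le l < n$ and $w, l + 1 \models_a \varphi$.
— $w, l \models_a \varphi \mathtt{U} \psi$ if and only if there exists $l' \ge l$ such that $w, l' \models_a \psi$ and $w, l'' \models_a \varphi$,
for all $l'' = l, \ldots, l' - 1$.
— $w, l \models_a \downarrow\varphi$ if and only if $w, l \models_{a_l} \varphi$.
— $w, l \models_a \uparrow$ if and only if $a = a_l$.

For a sentence $\varphi$ in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, we write $w, 1 \models \varphi$, if $w, 1 \models_a \varphi$ for some $a \in \mathfrak{D}$.
Note that since $\varphi$ is a sentence, all occurrences of $\uparrow$ in $\varphi$ are bounded. Thus, it
makes no difference which data value $a$ is used in the statement $w, 1 \models_a \varphi$ of the
definition of $w, 1 \models \varphi$. We define the language $L(\varphi)$ by $L(\varphi) = \{w \mid w, 1 \models \varphi\}$.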
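The inductive clauses translate almost line by line into a recursive evaluator. The Python sketch below is an illustration only: the function names `sat` and `models` and the tuple encoding of formulas (with `"up"` for $\uparrow$ and `"freeze"` for $\downarrow$) are assumptions, words are assumed to be non-empty lists of symbols, and positions are 1-based as in the definition.

```python
def sat(w, l, a, f):
    """Decide whether w, l |=_a f, for 1 <= l <= len(w) and register content a."""
    n = len(w)
    op = f[0]
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "up":       # current symbol equals the register content
        return a == w[l - 1]
    if op == "not":
        return not sat(w, l, a, f[1])
    if op == "or":
        return sat(w, l, a, f[1]) or sat(w, l, a, f[2])
    if op == "and":
        return sat(w, l, a, f[1]) and sat(w, l, a, f[2])
    if op == "X":        # a next position must exist
        return 1 <= l < n and sat(w, l + 1, a, f[1])
    if op == "freeze":   # store the current symbol in the register
        return sat(w, l, w[l - 1], f[1])
    if op == "U":
        # find l' >= l where f[2] holds, with f[1] holding on l, ..., l' - 1
        for lp in range(l, n + 1):
            if sat(w, lp, a, f[2]):
                return True
            if not sat(w, lp, a, f[1]):
                return False
        return False
    raise ValueError("unknown operator: " + str(op))


def models(w, sentence):
    """Decide w, 1 |= sentence; the initial register content is irrelevant."""
    return sat(w, 1, None, sentence)
```

For example, the sentence $\downarrow \mathtt{X}((\neg\uparrow)\,\mathtt{U}\,\uparrow)$, encoded as `("freeze", ("X", ("U", ("not", ("up",)), ("up",))))`, holds on a word exactly when its first symbol occurs again later.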

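Theorem 5.1. For every sentence $\psi \in \mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$, there exists a weak $k$-PA
$\mathcal{A}_\psi$, where $k = \mathrm{fqr}(\psi) + 1$, such that $L(\mathcal{A}_\psi) = L(\psi)$.

PROOF. Let $\psi$ be an $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ sentence. We construct an alternating weak $k$-PA
$\mathcal{A}_\psi$, where $k = \mathrm{fqr}(\psi) + 1$, such that given a word $w$, the automaton $\mathcal{A}_\psi$ "checks"
whether $w, 1 \models \psi$. $\mathcal{A}_\psi$ accepts $w$ if it is so. Otherwise, it rejects.

Intuitively, the computation of $w, 1 \models \psi$ is done recursively as follows. The
automaton $\mathcal{A}_\psi$ "consists of" the automata $\mathcal{A}_\varphi$ for all sub-formulas $\varphi$ of $\psi$.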

— If $\psi = \varphi \vee \varphi'$, then $\mathcal{A}_\psi$ nondeterministically chooses one of $\mathcal{A}_\varphi$ or $\mathcal{A}_{\varphi'}$ and
proceeds to run it.
— If $\psi = \varphi \wedge \varphi'$, then $\mathcal{A}_\psi$ splits its computation (by conjunctive branching) into
two and proceeds to run both $\mathcal{A}_\varphi$ and $\mathcal{A}_{\varphi'}$.
— If $\psi = \mathtt{X}\varphi$, then $\mathcal{A}_\psi$ moves to the right one step. If it reads the right-end marker $\triangleright$,
then it rejects immediately. Otherwise, it proceeds to run $\mathcal{A}_\varphi$.
— If $\psi = \uparrow$, then $\mathcal{A}_\psi$ checks whether the symbol seen by its head pebble is the same
as the one seen by the second last placed pebble. If it is not the same, then it
rejects immediately.
— If $\psi = \downarrow\varphi$, then $\mathcal{A}_\psi$ places a new pebble and proceeds to run $\mathcal{A}_\varphi$.
— If $\psi = \varphi \mathtt{U} \varphi'$, then $\mathcal{A}_\psi$ repeatedly does the following.

(1) It splits its computation (by conjunctive branching) into two.

(2) In one branch it runs $\mathcal{A}_\varphi$.

(3) In the other it moves one step to the right and starts on Step 1 again.

It repeatedly performs (1)-(3) until it nondeterministically decides to run $\mathcal{A}_{\varphi'}$.
— If $\psi = \neg\varphi$, then $\mathcal{A}_\psi$ runs the complement of $\mathcal{A}_\varphi$. The complement of $\mathcal{A}_\varphi$ can
be constructed by switching the accepting states into non-accepting states and
the non-accepting states into accepting states, as well as switching the universal
states into non-universal states and the non-universal states into universal states.

Note that since $\mathrm{fqr}(\psi) = k$, on each computation path the automaton $\mathcal{A}_\psi$ only
needs to place the pebble $k$ times, thus, $\mathcal{A}_\psi$ requires only $(k + 1)$ pebbles.
Now it is a straightforward induction on the length of $\varphi$ to show that

$w, l \models_a \varphi$ if and only if the configuration $[i, q, \theta]$ leads to acceptance,

where

— $i = \mathrm{fqr}(\varphi) + 1$;
— $q$ is the initial state of $\mathcal{A}_\varphi$;
— $\theta$ is a pebble assignment where $\theta(i) = l$ and $\theta(j) \le l$, for each $j \in \{i + 1, \ldots, k + 1\}$;
— $a$ is the symbol seen by pebble $(i + 1)$, if $i \ne k + 1$. (If $i = k + 1$, then $a$ can be
an arbitrary symbol.)

From here, it immediately follows that $L(\mathcal{A}_\psi) = L(\psi)$. □

Our next results deal with the expressive power of $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ based on the freeze
quantifier rank. It is an analog of the classical hierarchy of first order logic based
on the ordinary quantifier rank. We start by defining an $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ sentence for
the language $\mathcal{R}^{+}_{k}$ defined in Section 3.

Lemma 5.2. For each $k = 1, 2, 3, \ldots$, there exists a sentence $\psi_k$ in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$
such that $L(\psi_k) = \mathcal{R}^{+}_{k}$ and $\mathrm{fqr}(\psi_1) = 1$ and $\mathrm{fqr}(\psi_k) = k - 1$, when $k \ge 2$.

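Proof. First, we define a formula $\varphi_k$ such that $\mathrm{fqr}(\varphi_k) = k - 1$ and for every
word $w = d_1 \cdots d_n$, for every $i = 1, \ldots, n$,

$w, i \models_{d_i} \varphi_k$ if and only if $d_i \cdots d_n \in \mathcal{R}^{+}_{k}$. (9)

We construct $\varphi_k$ inductively as follows.

— $\varphi_1 = \mathtt{X}(\neg\uparrow) \wedge \neg(\mathtt{X}(\mathtt{X}\,\mathrm{True}))$.
— For each $k = 1, 2, 3, \ldots$,

$\varphi_{k+1} = \mathtt{X}(\neg\uparrow) \wedge \mathtt{X}(\downarrow \mathtt{X}((\neg\uparrow)\, \mathtt{U}\, (\uparrow \wedge \varphi_k)))$

Note that since $\mathrm{fqr}(\varphi_1) = 0$, then for each $k = 1, 2, \ldots$, $\mathrm{fqr}(\varphi_k) = k - 1$.

It is straightforward to show that $\varphi_k$ satisfies Equation (9). The desired sentence
$\psi_k$ is defined as follows.

— $\psi_1 = \downarrow(\mathtt{X}(\neg\uparrow) \wedge \neg(\mathtt{X}(\mathtt{X}\,\mathrm{True})))$.
— For each $k = 2, 3, \ldots$,

$\psi_k = \downarrow(\mathtt{X}(\neg\uparrow)) \wedge \mathtt{X}(\downarrow \mathtt{X}((\neg\uparrow)\, \mathtt{U}\, (\uparrow \wedge \varphi_{k-1})))$

Obviously, $\mathrm{fqr}(\psi_1) = 1$. For $k \ge 2$, $\mathrm{fqr}(\varphi_{k-1}) = k - 2$, thus, $\mathrm{fqr}(\psi_k) = k - 1$. □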
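For $k \ge 2$ the rank computation can be unfolded step by step from the rules defining the freeze quantifier rank. The display below is only a spelled-out version of the argument, using $\mathrm{fqr}(\varphi_{k-1}) = k - 2$ and reading $\psi_k$ as the conjunction of $\downarrow(\mathtt{X}(\neg\uparrow))$ with $\mathtt{X}(\downarrow \mathtt{X}((\neg\uparrow)\,\mathtt{U}\,(\uparrow \wedge \varphi_{k-1})))$.

```latex
\[
\begin{aligned}
\mathrm{fqr}(\psi_k)
  &= \max\bigl(\mathrm{fqr}(\downarrow(\mathtt{X}(\neg\uparrow))),\;
               \mathrm{fqr}(\mathtt{X}(\downarrow\mathtt{X}((\neg\uparrow)\,\mathtt{U}\,(\uparrow \wedge \varphi_{k-1}))))\bigr) \\
  &= \max\bigl(1 + \mathrm{fqr}(\mathtt{X}(\neg\uparrow)),\;
               1 + \mathrm{fqr}((\neg\uparrow)\,\mathtt{U}\,(\uparrow \wedge \varphi_{k-1}))\bigr) \\
  &= \max\bigl(1 + 0,\; 1 + \max(0, \mathrm{fqr}(\varphi_{k-1}))\bigr) \\
  &= \max(1,\; 1 + (k - 2)) = k - 1 .
\end{aligned}
\]
```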
Lemma 5.3. For each $k = 1, 2, \ldots$, the language $\mathcal{R}^{+}_{k+1}$ is not expressible by a
sentence in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ of freeze quantifier rank $(k - 1)$.

Proof. By Lemma 4.3, $\mathcal{R}^{+}_{k+1}$ is not accepted by any weak $k$-PA. Then, by Theorem 5.1,
$\mathcal{R}^{+}_{k+1}$ is not expressible by an $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ sentence of freeze quantifier rank
$(k - 1)$. □

Combining both Lemmas 5.2 and 5.3, we obtain that for each $k = 1, 2, \ldots$, the
language $\mathcal{R}^{+}_{k+1}$ separates the class of $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ sentences of freeze quantifier rank
$k$ from the class of $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ sentences of freeze quantifier rank $(k - 1)$. Formally,
we state it as follows.

Theorem 5.4. For each $k = 1, 2, \ldots$, the class of sentences in $\mathrm{LTL}^{\downarrow}_{1}(\mathtt{X}, \mathtt{U})$ of
freeze quantifier rank $k$ is strictly more expressive than those of freeze quantifier
rank $(k - 1)$.